Foreword



Forecasting is a hazardous yet essential enterprise, in demography as in other 
fields. This volume contains contributions to the theory and practice of 
forecasting of mortality in the relatively favourable circumstances in developed 
countries; that is, when extensive historical data are available, at least in 
aggregate form, and when the economic, epidemiological and social contexts 
are understood to the extent that current knowledge permits. In this case it is 
possible to focus on the central problem of first finding an apt description of 
the past and then combining this historical knowledge with a variety of 
considerations, many subjective, to make a forecast. An apt description of the 
past is increasingly coming to mean a quantitative summary as given by a 
statistical model in the form of a regression or time series. The subjective 
considerations are essentially judgemental factors based on more or less expert 
opinion. 

Even though circumstances are relatively favourable in low-mortality 
developed countries, the advantages are only relative, and there remain serious 
impediments to the process of formulating a forecast. This is partly due
to difficulties in finding a generally acceptable methodological approach and
to dilemmas in formulating assumptions for models, and partly due to
deficiencies in the data, such as a lack of generation and cause-of-death data or
of longitudinal survey data on linked mortality and disability. It is therefore
useful to discuss the experience collected by scientific institutes and statistical 
practice with a view to developing improved forecasting techniques. 
Considerations such as these prompted the organisation of the workshop 
'Forecasting of mortality in developed countries: searching for better methods 
and realistic assumptions'. 

The workshop was an initiative of the Netherlands Interdisciplinary 
Demographic Institute, and took place at its premises on 5 September 1997. 
The participants included representatives of several Dutch and European 
organisations, which have done much of the work on forecasting of mortality:
the Netherlands Interdisciplinary Demographic Institute (NIDI), the French
National Demographic Institute (INED), the National Institute for Public 
Health and the Environment (RIVM, the Netherlands), some departments of 
the Erasmus University in Rotterdam, the Population Research Centre of 
Groningen University (PRC RUG), Statistics Netherlands (NCBS), and the 
Statistical Office of the European Communities (Eurostat). By inviting experts, 
we hoped to benefit from a range of experience in the field. Three invited 
speakers were present: professor Nicolas Brouard (the French National
Demographic Institute (INED), Paris, France), professor Christopher
Heathcote (the Australian National University, Canberra, Australia) and Harri
Cruijsen (Eurostat, Luxembourg). The first two guests are statisticians, both
oriented theoretically in their work on mortality and health. The third guest, a 
mathematician, represents the statistical practice in Europe and has a deep 
interest in demographic projections. 

The workshop was a platform where we summarised and shared relevant 
experience, as well as explored future directions in making forecasts of 
mortality in low-mortality countries in Europe, in particular in the 
Netherlands. Our discussions were creative and constructive, inventive and
stimulating. They were also well structured and complete, having been
carefully prepared, so we decided to put our thoughts on paper. The
workshop was apparently a strong incentive to many of us, as today we are
able to present a coherent collection of 12 papers written in response to what
we talked about at that time. In this way, the reader gets a book with a
state-of-the-art overview of recent work on mortality forecasting in
developed countries of western Europe, especially in the Netherlands, and of
prospects for mortality forecasting in these countries.

Our book is meant for all scientists interested in forecasting mortality and aims 
to bring together contributions not only from demography but also from 
official and mathematical statistics and epidemiology. Our belief is that an 
interdisciplinary approach has much to offer. Techniques from mathematical 
statistics and econometrics can provide useful descriptions of past mortality. 
The naive forecast obtained by extrapolating a fitted model may give as good a
forecast as any, but forecasting by extrapolation requires careful justification
since it assumes the prolongation of historical conditions. That is, stationarity 
is assumed. On the other hand, whilst it is generally accepted that scientific 
and other advances will continue to impact on mortality, perhaps dramatically 
so, it is impossible to quantify more than the outline of future consequences
with a strong degree of confidence. The decision to modify an extrapolation of
a model fitted to historical data (or conversely choosing not to modify it) to 
obtain a forecast must therefore be strongly influenced by subjective and 
judgemental elements, with the quality of the latter dependent on demographic, 
epidemiological and indeed perhaps more general considerations. Thus the 
thread running through the book reflects the necessity of integrating
demographic, epidemiological and statistical factors to obtain an improvement 
in the prediction of mortality. Included are the following issues: statistical 
models in both the descriptive and predictive senses, assumptions about 
changes in future mortality and making explicit judgmental and subjective 
considerations, and satisfying the needs of users by incorporating issues such 
as health and morbidity into forecasting. 

There are four parts to the book: an introduction (Part 1), theoretical 
perspectives on the forecasting of mortality (Part 2), from theory to practice 
(Part 3), and issues for the future (Part 4). 

Part 1 consists of two review contributions. Of these, Chapter 1 reviews
demographic methods of forecasting mortality and includes a discussion of 
time series and other parametric models that have developed a substantial 
literature in recent years. Chapter 2 describes epidemiological models which 
incorporate consideration of disease processes and related risk factors and their 
use in forecasting mortality. 

The material in Part 2 is more mathematical in nature. Chapters 3 and 4 deal 
with regression modelling of what are called mortality surfaces. These 
surfaces are functions of time and age that are measures of mortality and that 
can be estimated by known statistical methods. Chapter 5 brings together facts 
about the Gompertz distribution and related matters. Chapter 6 treats the 
problems of modelling mortality at the oldest old ages, again using regression 
techniques, and including a comparison of demographic models for mortality 
over age 80. 

The focus of the contributions in Part 3 is on practical matters. Chapter 7 
discusses the role of period, cohort and cause-of-death effects in the 
forecasting of mortality. Chapter 8 adopts an epidemiological approach in 
which mortality is considered from the point of view of combining risk factor 
prevalence and related disease risks. Models used for official forecasts of 
Dutch mortality are presented in Chapter 9, and Chapter 10 is a critical review
of the methods and assumptions used in obtaining the latest mortality forecasts
in the countries of the European Union. 

Chapter 11 in Part 4, Issues for the future, reviews mortality models 
formulated using concepts belonging to various theories of human ageing. 
Hopefully, some of the models representing this new line of research will be 
used in forecasting mortality in the years to come. Finally, Chapter 12 in Part 
4 summarises the content of this book and focuses on the requirements of 
mortality forecasting from the perspective of assumptions, models and data. 
The discussion is influenced by keeping in mind various forms of the demand 
for information on future levels of mortality, that is, demand due to population 
forecasting, health forecasting and scientific analyses. This chapter ends by 
stressing the necessity of integrating the tools and perspectives of the 
disciplines of demography, epidemiology, and statistics in order to achieve 
improved forecasts of mortality. 



The editors 




Preface 



This book is the result of several activities related to forecasting mortality and 
health in low-mortality countries of Europe in the 1990s. Many of these
activities were completed with the financial support of the European
Commission's Directorate-General V (Employment, Industrial Relations and
Social Affairs), the Netherlands Interdisciplinary Demographic Institute 
(NIDI), and the Dutch Institute of Public Health and the Environment 
(RIVM). 

Several people helped us at different stages of this project contributing to the 
completion of this book. First of all, thanks are due to all those whose views
and ideas inspired a great deal of the work completed in this project and who
also enabled us to gather the necessary data for the analyses presented here:
France Meslé, Jacques Vallin and Nicolas Brouard in France, Graziella Caselli
and Valerio Terra Abrami in Italy, Jens-Kristian Borgan in Norway, James 
Vaupel in Germany, and Harri Cruijsen in the Netherlands (previously at 
Eurostat in Luxembourg). We thank Kirill Andreev (Germany) who prepared 
the oldest-old data and Jeroen Berkien (the Netherlands) who helped us 
restructure certain data. Our greatest debt is to an anonymous referee who 
reviewed the manuscript on behalf of ESPO and in a handful of priceless 
remarks and suggestions guided the authors in their revisions and the editors in 
editing this volume. We received invaluable support from Evert van Imhoff 
(the Netherlands) who read and commented on several chapters. Leo van 
Wissen (the Netherlands) helped us with the organisational aspects of this 
project. Many thanks are due to Willemien Kneppelhout and Anne Mark for 
their professional approach and creativeness in editing our English. We thank 
Tonny Nieuwstraten who with devotion and passion prepared the final lay-out 
of this volume, Leon Vermeulen who invented the electronic procedures for 
this publication, and Jacqueline van der Helm who had the final responsibility 
for the ESPO style of this volume. 




List of Authors 



Joop de Beer is an economist and chief of the Population Forecasting Unit at
the Population Division of Statistics Netherlands. 

Anneke van den Berg Jeths is a sociologist working as a senior researcher on 
the future of health and health care in the Netherlands. She is one of the 
project leaders in the project “Public Health Status and Forecasts” at the
National Institute of Public Health and the Environment (RIVM).

Lech Boleslawski is a statistician and demographer, and chief of the Population 
Forecasting Section at the Population Division of Statistics Poland. He is
responsible for the official mortality forecasts for Poland. 

Alinda Bosch is a demographer working in the field of mortality, migration 
and reproductive health at the Netherlands Interdisciplinary Demographic 
Institute. 

Harri Cruijsen is a mathematician and project leader in the field of 
demographic projections for the Statistical Office of the European Commission 
(Eurostat). He is currently attached to Statistics Netherlands. 

Harold Eding is a demographer and works as a researcher in projects on 
European demographic projections at the Netherlands Interdisciplinary 
Demographic Institute. 

Peter Ekamper is a demographer and economist, working as a senior 
researcher in the field of demographic forecasting at the Netherlands 
Interdisciplinary Demographic Institute. 

Marianne van Genugten is a mathematician and senior researcher working on 
public health forecasting at the National Institute of Public Health and 
Environment, the Netherlands. 







Christopher Heathcote (Ph.D.) retired as a professor of mathematical statistics 
at the Faculty of Economics and Commerce, Australian National University, 
in 19%. He still works at the university as a visiting fellow and emeritus 
professor. 

Tim Higgins is a statistician working on mortality forecasting in the Australian 
Government Actuary's Office, Canberra, and as a research student at the 
Australian National University. 

Guus de Hollander was trained in biology, environmental epidemiology, 
toxicology and science philosophy. He has been working in the field of 
environmental health impact assessment, risk assessment and management, 
both as a researcher at RIVM and as a scientific secretary to the Health 
Council of the Netherlands. 

Rudolf Hoogenveen studied applied mathematics, specializing in operations 
research and system theory. He has been working on the development and use 
of life table-based models in epidemiology and public health at the National 
Institute of Public Health and the Environment (RIVM), the Netherlands. 

Wim van Hoorn is a statistician and senior associate working at the Population 
Forecasting Unit of the Population Division of Statistics Netherlands. He
prepares official mortality forecasts in the Netherlands. 

Corina Huisman is a demographer working on mortality forecasting and other 
demographic processes in the Demographic Forecasting Research Cluster at
the Netherlands Interdisciplinary Demographic Institute. 

Ewa Tabeau (Ph.D.) is a demographer and statistician working as a project 
leader in the field of quantitative and qualitative research on mortality, health 
and longevity in the Demographic Forecasting Research Cluster at the
Netherlands Interdisciplinary Demographic Institute. 

Frans Willekens (Ph.D.) is a professor of mathematical demography and head 
of the Population Research Centre at the University of Groningen in the 
Netherlands. 




Anatoli Yashin (Ph.D.) is a professor of mathematical demography and head of 
the Laboratory of Advanced Statistical Methods at the Max-Planck Institute for 
Demographic Research in Germany. 




List of Figures 



2.1. Classes of determinants of health status
2.2. Global burden of disease: model used in modelling mortality and morbidity
2.3. Estimated total annual AIDS incidence in the European Union 1981-1998 among adults and adolescents (pre-1993 case definition), with approximate Bayesian prediction intervals (1994-98)
2.4. Effectivity ratios of interventions on chlamydia
2.5. Basic structure of a model for cancer screening
3.1. Lexis diagram. Historical data shown as a rectangle t(0) < t < t(1), x(0) < x < x(1). The cohort born at time c lies on the diagonal commencing at (c,0)
3.2. Lexis diagram of population data along a cohort
3.3. Dutch male observed log(odds), ages 1-100, years 1850-1990
3.4. Dutch male and female log(odds) for various ages, 1890-1990 (males bold, females dashed)
3.5. Plot of the fitted mortality surface of Dutch males (see Table 3.1)
4.1. Observed and extrapolated post-war log(odds) of Dutch males. Ages 40-94, years 1946-2030
4.2. Period and cohort life expectancy from fitted and extrapolated mortality surfaces. Dutch males and females at ages 60 and 80
4.3. Observed, fitted and extrapolated log(odds) based on descriptive models. Dutch males and females at ages 60 and 80
4.4. Observed and predicted log(odds) based on predictive models. Dutch males and females at ages 60 and 80
4.5. Period and cohort life expectancy from fitted and predicted mortality surfaces. Dutch males and females at ages 60 and 80
4.6. Period life expectancy calculated from descriptive and predicted mortality surfaces. Dutch males and females at ages 40, 60 and 80







4.7. Probability of survival to age x given age 40 in 1970 (1930 birth cohort) based on descriptive and predictive models. Dutch males and females
6.1. Old-age mortality in four countries 1950-1994
6.2a. Old-age mortality by countries and decades, countries (pooled data from 1950-1994)
6.2b. Old-age mortality by countries and decades, decades (pooled data from four countries)
6.3. Exponential rate of change of mortality with age (three countries, 1950-1994)
6.4. Three functions employed in modelling old-age mortality. Pooled data from four countries, 1950-1994. Fit interval 80-109 years
6.5. Extrapolation of mortality beyond age 85 resulting from selected models. Pooled data from three countries, 1950-1994. Fit interval 60-84 years, weights method 2
7.1. Forecast of mortality from lung cancer: age patterns for Dutch men
7.2. Empirical and forecasted SMRs, lung cancer, Dutch men, age 40+
7.3. Static and dynamic estimates of base parameters. Mortality from lung cancer, Dutch men
7.4a. Empirical and forecasted age-standardised mortality rates. Overall period, cause-specific period forecasts
7.4b. Empirical and forecasted age-standardised mortality rates. Overall period, overall cohort, cause-specific forecasts
8.1. Basic structure of the chronic diseases model
8.2a. Smoking prevalence in different scenarios
8.2b. Smoking prevalence in different scenarios
8.3a. Standardized lung cancer mortality in different scenarios
8.3b. Standardized lung cancer mortality in different scenarios
8.4a. Standardized coronary heart disease mortality in different scenarios
8.4b. Standardized coronary heart disease mortality in different scenarios
9.1. Life expectancy at birth in the Netherlands
9.2. Sex differences in life expectancy (F-M)
9.3. Change in mortality rates ratio (1993-1997) / (1988-1992)
9.4. Mortality rates in 2050 (1995=100)







9.5. Life expectancy at birth, 1998 Netherlands Population Forecasts
10.1. Number of projection variants
10.2. Length of projection period (in years)
10.3a. Increase in life expectancy - males (years)
10.3b. Increase in life expectancy - females (years)
10.4. Variance in life expectancy between the EU countries (years)
10.5a. Male life expectancy for 2000 - differences between the latest national forecasts and those made around 1985 (years)
10.5b. Female life expectancy for 2000 - differences between the latest national forecasts and those made around 1985 (years)
10.6a. Male life expectancy - differences between national forecasts and UN projections, 1995-2020 (years)
10.6b. Female life expectancy - differences between national forecasts and UN projections, 1995-2020 (years)
10.7a. Male life expectancy - differences between national forecasts and Eurostat's baseline scenario, 1995-2020 (years)
10.7b. Female life expectancy - differences between national forecasts and Eurostat's baseline scenario, 1995-2020 (years)
10.8. Infant mortality rate. Females 2020 (1995 = 100)
10.9. Death rates at age 80-84. Females 2020 (1995 = 100)
10.10a. Age-specific mortality rates - females - changes 1995-2020
10.10b. Age-specific mortality rates - males - changes 1995-2020
10.10c. Variance of mortality changes between EU countries, 1995-2020




List of Tables 



1.1. Main parameterization functions for mortality
2.1. Actual and projected mortality from lung cancer by age and sex, England and Wales, 1951-2025 (rates per million)
2.2. Estimates of life expectancy with risk factor interventions

3.1. Dutch males. Results for the fit to
3.2. Dutch females. Results for the fit to $S(\beta; t, x) = \beta_0 + \sum_{i=1}^{24} \ldots$ for (t,x) in (3.5)

4.1. Observed (1990) and predicted (t > 1991) period life expectancies for Dutch males and females. Extrapolations of the descriptive models (3.9) and (3.10). Standard errors shown in brackets
4.2. Cohort life expectancies of Dutch males and females predicted from the descriptive models (3.9) and (3.10). Standard errors are shown in brackets
4.3. Observed (1990) and predicted (t > 1991) period life expectancies for Dutch males and females. Extrapolation of the predictive models (4.1) and (4.2). Standard errors similar to those in 4.1
4.4. Cohort life expectancies for Dutch males and females obtained from extrapolation of the predictive models (4.1) and (4.2). Standard errors similar to those in 4.2
6.1. Mortality models employed in the survey
6.2. Goodness of fit of 11 models for old-age mortality in four countries, 1950-1994. Fit interval 80-109 years
6.3a. Sum of squared residuals in 11 models for old-age mortality, by countries, 1950-94. Fit interval 80-104 years
6.3b. Sum of squared residuals in 11 models for old-age mortality, by countries, 1950-94. Fit interval 80-104 years







6.4a. Fit and extrapolation goodness of 14 models for old-age mortality, three countries, 1950-1994. Fit interval 60-84 years
6.4b. Fit and extrapolation goodness of 14 models for old-age mortality, three countries, 1950-1994. Fit interval 60-84 years
7.1. Alternative models for mortality of Dutch men
7.2. Overview of trends in the Gompertz parameters used for our projections
7.3. Life expectancy in France, Italy, the Netherlands and Norway in 2020 according to the period, cause-of-death and cohort projection approaches
7.4. Observed and projected average annual rates of change in SMRs by forecasting approach




Part 1. Introduction




1. A Review of Demographic
Forecasting Models for 
Mortality 



Ewa TABEAU



Abstract 

The goal of Chapter 1 is to describe and comment on the methods and 
approaches that have been in use or have emerged in recent years. Section 1.1 
introduces the most common classifications of forecasting models for 
mortality. Section 1.2 is devoted to a brief historical review of
parameterisation functions. In this context, attention is paid to prediction based
on parameterised age schedules, in particular by using time series models.
Section 1.3 focuses on the (statistical association) models of Lee and Carter
and Section 1.4 characterises the (log-linear) age-period-cohort models. In
Section 1.5 the reader can find a review of the methods used in international 
statistical practice and in Section 1.6 the importance of uncertainty in 
forecasting is addressed. Section 1.7 outlines the prospects for modelling and 
forecasting mortality as seen from the perspective of this chapter. 



1.1 Most Common Classifications of Forecasting Models for Mortality

A positive feature of forecasting mortality in developed countries is that 
adequate historical information is usually available, at least for aggregate 
measurements. In this case the two salient tasks facing a forecaster are
finding an accurate description of the past and, secondly, taking account of
judgemental factors in order to produce plausible forecasts. The first can be
viewed as a technical problem essentially concerned with modelling and
estimating a time- and age-specific measure of mortality. Judgement comes
into play since the naive forecast obtained by extrapolation may lead to
nonsensical predictions. Problems of forecasting specific to demography are
dealt with extensively in the literature, some of which is referenced in this
volume. In Chapter 1 forecasting methods for mortality alone are discussed.
Methods of forecasting mortality and health jointly are discussed by health
forecasters (Van den Berg Jeths et al., Chapter 2 in this volume). Two lines of
forecasting and/or making projections of mortality are included: one
scientifically oriented and one with a practical orientation. In considering the
practice of forecasting mortality, the procedures applied by international 
organisations, such as Eurostat, the World Bank and the United Nations, are 
discussed. National practices are beyond the scope of this chapter. The 
discussion below is intentionally concise, in line with the goal of this review,
namely to describe and comment on the methods that have been used or have 
emerged in recent years. This chapter does not strive to present a practical 
guide for forecasters or a systematic discussion of the most commonly used 
statistical forecasting approaches for mortality. 

Forecasting of mortality has traditionally been a central issue among actuaries 
and demographers. In recent years, however, interest in the development of 
new models and strategies for modelling and forecasting mortality has slightly 
decreased in actuarial science and demography. Interest in this field has, on 
the other hand, grown in quantitative research on ageing, that is in biostatistics,
genetics, evolutionary biology, gerontology, and geriatrics (see also Yashin,
Chapter 11 in this volume). New methods of modelling and forecasting
mortality and health as a joint category have received considerable attention
from quantitatively oriented public health scientists (Van den Berg Jeths et al.,
Chapter 2 in this volume).

In demography, several recent publications have made significant contributions 
to the methodological aspects of forecasting of mortality (e.g. in alphabetical 
order: Benjamin and Soliman, 1993; Gomez de Leon and Texmon, 1992; 
Keyfitz, 1991; Lopez and Hakama, 1986; Murphy, 1990; Olshansky, 1988; 
Pollard, 1987 and Willekens, 1990). Pollard (1987) reviewed a variety of 
methods which have been suggested by actuaries and demographers for the 
projection of age-specific mortality rates: projection by extrapolation of
mortality rates (or transformations of mortality rates) at selected ages;
projection by reference to a 'law of mortality'; projection by reference to
model life tables; projection by reference to another 'more advanced'
population; projection by reference to an 'optimal' life table under ideal
conditions; projection by cause of death; and combinations of these methods.
He gave examples of the use of these methods and drew conclusions on
their respective advantages and disadvantages. A similar type of review was also
presented by others (e.g. Benjamin and Soliman, 1993). This type of 
publication serves as a practical guideline for forecasters. Regarding the choice 
of preferable methods, these authors usually state that "the answer is clearly 
related to the type, the extent, and the quality of the data available at the 
moment of projection" (Pollard, 1987). 

The classification of forecasting methods for mortality is seen differently by 
different authors. For instance, Gomez de Leon and Texmon (1992) suggest 
that cause-of-death models be distinguished from traditional methods (after 
Olshansky, 1988) and that traditional methods be divided into two categories, 
namely methods based on extrapolation and procedures based on mortality in a 
more 'advanced' or a 'natural' population. These latter methods include 
interpolation between observed mortality and 'advanced'-level mortality (target
projections). Altogether, Gomez de Leon and Texmon distinguish four major
approaches to projecting mortality: projection by extrapolation, projection by
reference to a particular mortality model (Brass relational system), "target" 
projection, and projection by reference to mortality components (i.e. to causes 
of death). 

The above rather technical views can be supplemented by more substantive
opinions. For instance, Willekens (1990) distinguishes extrapolation methods
and methods based on epidemiological and biomedical research. A major 
feature of the extrapolation techniques is a lack of information on the forces 
shaping the change in mortality. These methods are based only on trends 
observed in the past. They differ in the way they summarise age profiles. 
There are models referring to single ages (e.g. geometric progression equation 
applied to each age group separately) and models considering the entire age 
span (e.g. 'laws' of mortality or graduation models). An alternative method 
expresses the set of age-specific mortality rates or a transformation of the rates 
as a modification of a standard profile (relational models; Brass two-parameter 
system and its modifications; Zaba, 1979, 1993; Ewbank et al., 1983). More
complex methods add the cohort factor to the common age structure, resulting
in three-factor age-period-cohort (APC) models. Willekens suggests that
methods based on epidemiological and biomedical research are increasingly 
being used to improve forecasting of mortality. A preliminary step in this 
direction is projecting mortality by cause of death. However, even though 
many industrialised countries consider cause of death in mortality forecasts, 
modelling techniques applied to cause-specific data are pattern-oriented rather 
than process-oriented. 
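
The single-age extrapolation mentioned above (a geometric progression applied to each age group separately) can be sketched as follows. This is a minimal illustration only: the function name, the choice to estimate the annual ratio from the first and last observations, and the input rates are all assumptions made here, not a method or data taken from this volume.

```python
def geometric_extrapolation(rates, horizon):
    """Project one age group's death rates with a constant annual ratio."""
    if len(rates) < 2 or min(rates) <= 0:
        raise ValueError("need at least two positive observed rates")
    # Average annual ratio of change between the first and last observation.
    ratio = (rates[-1] / rates[0]) ** (1.0 / (len(rates) - 1))
    # Continue the same geometric progression for `horizon` future years.
    return [rates[-1] * ratio ** k for k in range(1, horizon + 1)]


# Hypothetical rates for two age groups over four consecutive years
# (illustrative values only, not observed data).
observed = {
    60: [0.0200, 0.0196, 0.0192, 0.0188],
    80: [0.1000, 0.0990, 0.0980, 0.0970],
}
forecast = {age: geometric_extrapolation(r, 3) for age, r in observed.items()}
```

Because each age group is projected on its own, nothing in such a scheme keeps the projected age profile plausible over long horizons, which is one reason the text turns to models that consider the entire age span.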

Many of the major methodological and practical results on the process-oriented
approach to forecasting mortality by cause of death were proposed by
Manton and colleagues (e.g. Manton and Stallard, 1984; Manton, 1993). They 
suggest that efforts aimed at modelling and forecasting mortality can be 
classified into empirical or extrapolation procedures and procedures based on 
theories of human mortality and ageing. Projecting mortality from biological 
theories of ageing has a long actuarial tradition which began with the 
Gompertz law of mortality (1825; Manton, 1993). The rationale behind this 
law is the proportional loss with age of man's ability to withstand 
'destruction'. A constant was added to this relation by Makeham (1860) to
reflect mortality caused by chance factors. Manton (1993) notes that it was 
shown empirically that Gompertz tends to overpredict mortality at advanced 
ages (see also Boleslawski and Tabeau, Chapter 6). The theoretical curve more 
closely approaches the slower rate of increase in mortality at advanced ages in 
the model proposed by Perks (1932). Also in recent years, the flattening of 
mortality curves has been shown by several authors for both human and 
animal populations (e.g. Vaupel et al., 1998). Manton (1993) also finds that
approximately 95 per cent of the age dependence represented by the Gompertz 
function in human populations could be explained by measures of chronic 
disability. Despite the long tradition of biologically motivated models of the 
age trajectory of human mortality in actuarial science, current forecasting 
procedures are often based on extrapolation and do not directly reflect 
physiological processes at the individual level or in a cohort. This makes it 
difficult to use epidemiological and biomedical evidence on health changes in 
forecasts. 
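
For orientation, the three laws mentioned above are commonly written as shown below. The notation is an assumption made here for illustration (force of mortality $\mu(x)$ at age $x$, positive constants $A$, $B$, $D$ and $c > 1$) and need not match the notation used elsewhere in this volume; the Perks curve is given in a simplified, commonly quoted form.

```latex
\begin{align}
\mu(x) &= B c^{x}                         && \text{(Gompertz, 1825)} \\
\mu(x) &= A + B c^{x}                     && \text{(Makeham, 1860)} \\
\mu(x) &= \frac{A + B c^{x}}{1 + D c^{x}} && \text{(Perks, 1932)}
\end{align}
```

In this form the Perks curve levels off towards $B/D$ at the highest ages, which corresponds to the slower rate of increase noted above, whereas the Gompertz and Makeham curves continue to grow exponentially with age.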

The division of forecasting models into methods based on extrapolation and on 
underlying theories should ideally be followed in reviews such as ours. The 
practice of forecasting is, however, mainly based on extrapolation from 
various descriptive models, usually combined with judgement. This implies 
that the form of the descriptive model and the importance of judgement were 
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convenient criteria that could serve as the underlying principles for the 
discussion here. The relatively simple but easy to expand parameterisation 
functions of the mortality age schedule are discussed first (Section 1.2). At the 
end of this section, prediction from the parameterised age schedules is 
reviewed. In this context, time series models of the parameters of model age 
schedules are summarised. In Section 1.3, more complex two-factor (i.e. age- 
period) models are introduced with an example of the models of Lee and 
Carter. Afterwards, we review a large group of applications of age-period- 
cohort models representing the most complex modelling approach for 
mortality, based on three factors (Section 1.4). Predictions made with the aid 
of the models discussed in Sections 1.2 to 1.4 are usually obtained with 
minimum judgement. The role of judgement is much more prominent in the 
mortality projections made by international organisations. We discuss the 
methods and assumptions underlying these projections in Section 1.5, and 
Section 1.6 addresses the importance of uncertainty in forecasting. Finally, 
Section 1.7 closes the discussion by proposing possible approaches to 
forecasting in the future. 



1.2 | Parameterisation Functions 

Parameterisation functions are often termed mortality 'laws'. They describe 
mortality age patterns in terms of mathematical functions of age. The number 
of parameters (sometimes interpretable) usually ranges from 2 to 9. Variables 
which are parameterised are age-specific death rates, probabilities or other 
measures. Parameterisation is applied to a 'closed' population, i.e. to a birth or 
synthetic (observed cross-sectionally in a year) cohort. Parameterisation 
models have been used by demographers, medical scientists and actuaries. 
Their usefulness has been limited in the past to smoothing data, eliminating 
and/or reducing errors, constructing life tables, aiding inferences from 
incomplete data, facilitating comparisons of mortality, and forecasting 
(Keyfitz, 1982). The role of parameterisation models has recently further 
increased. Entirely new areas of applications for parameterisation have 
emerged, such as modelling disease processes (e.g. Manton and Stallard, 
1984, 1988; also Van den Berg Jeths et al., Chapter 2 in this volume), 
modelling heterogeneity, stochasticity and homeostasis in survival (e.g. 
Yashin, Chapter 11), and expanding the structure of age-period-cohort models 
(e.g. Holford et al., 1994; also Section 1.4 of this chapter). 

'Laws' of mortality have developed into an established research topic since the 
first half of the 18th century (see Table 1.1). Examples of the older laws 
include the models of de Moivre (1725), Gompertz (1825), Makeham (1860), 
Opperman (1870), Thiele (1872), Wittstein (1883), Steffenson (1930), Perks 
(1932), Harper (1936) and Weibull (1939; all after Tabeau et al., 1994), while 
the models of Heligman and Pollard (1980) or Rogers and Little (1994) belong 
to the more recent ones. 

Many recent functions have their origins in the old functions. Both groups 
therefore have many similarities. However, a division of models in Table 1.1 
cannot be based on a single criterion. Some functions are closely related to the 
Gompertz model, the so-called Gompertzian, whilst others have considerably 
different specifications, the non-Gompertzian functions. Single- and multiple- 
component functions can be distinguished on the basis of causal 'sources' of 
mortality. Whilst single-component models do not make a distinction between 
different causes of death that dominate subsequent stages of life, multi- 
component models do. Mathematical model formulation allows all functions to 
be divided into polynomials and non-polynomial functions. One group of 
models has an extended theoretical background; the other has no theoretical 
background at all. Some parameterisation functions were developed for the 
entire age span, others only refer to selected ages. The above divisions are 
rather formal, one exception being the distinction between polynomials and 
non-polynomials, which is virtually identical to the distinction between the 
models with and without a theoretical background. 

Polynomials are popular interpolation and graduation techniques. A good 
reason for using polynomials in modelling age patterns of mortality is that 
most functions can be approximated by a polynomial to any degree of 
accuracy in the form of a Taylor power series. Therefore, the exponents of 
quadratic, cubic, and higher degree polynomials are often used to model the 
age variability of the death probability, its odds ratio, mortality rate, or life 
expectancy. 
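
A minimal sketch of polynomial graduation follows. It reads 'exponents of polynomials' as a polynomial fitted to the logarithm of the mortality measure, which is an assumption about the intended meaning. The death rates are synthetic and the function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def fit_log_polynomial(ages, rates, degree=2):
    """Least-squares fit of log(rate) = a0 + a1*u + ... + an*u**n, with u = age / 100."""
    u = np.asarray(ages, dtype=float) / 100.0    # rescale age for numerical stability
    return P.polyfit(u, np.log(rates), degree)   # coefficients, lowest degree first

def predict_rate(ages, coefs):
    """Evaluate exp(polynomial) at the given ages."""
    u = np.asarray(ages, dtype=float) / 100.0
    return np.exp(P.polyval(u, coefs))

# Synthetic adult death rates generated from a known quadratic on the log scale.
ages = np.arange(30, 91)
true_coefs = np.array([-9.0, 4.0, 4.0])
rates = np.exp(P.polyval(ages / 100.0, true_coefs))

coefs = fit_log_polynomial(ages, rates, degree=2)
smoothed = predict_rate(ages, coefs)
```

Raising `degree` improves the fit at the cost of parsimony and of less stable behaviour outside the fitted age range.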

The second group, 'non-polynomials', mainly consists of additive multi- 
component models. Recently formulated non-polynomial functions usually 
assume that different causes of death apply in infancy and childhood, at 
adulthood and in old age. The models usually include three components, each 
for a different stage of life. The components are defined in such a way that 
each of them vanishes at ages different from those for which they were 
basically specified. Therefore, the sum of the components describes the 
mortality pattern across the entire age span. Parameters of these models are 
estimated in one step. 

Table 1.1. Main parameterisation functions for mortality 

Author                  Publication   Model 

OLD NON-POLYNOMIALS 

De Moivre               1725          $\mu(x) = 1/(\omega - x)$ 
Gompertz                1825          $\mu(x) = B c^{x}$ 
Makeham                 1860          $\mu(x) = A + B c^{x}$ 
                                      $\mu(x) = A + Hx + B c^{x}$ 
Opperman                1870          $\mu(x) = a/\sqrt{x} + b + c\,\sqrt[3]{x}$ 
Thiele                  1872          $\mu(x)$ [formula not legible in the source] 
Wittstein               1883          $q(x)$ [formula not legible in the source] 
Steffenson              1930          $\log_{10} s(x)$ [formula not legible in the source] 
Perks                   1932          $\mu(x) = (A + B c^{x})/(1 + D c^{x})$ 
                                      $\mu(x) = (A + B c^{x})/(K c^{-x} + 1 + D c^{x})$ 
Harper                  1936          $\log_{10} s(x)$ [formula not legible in the source] 
Weibull                 1939          $\mu(x)$ [formula not legible in the source] 
Van der Maen            1943          $\mu(x) = A + Bx + Cx^{2} + I/(N - x)$ 
                                      $\mu(x) = A + B c^{x} + I/(N - x)$ 

POLYNOMIALS 

Unnamed                               $y(x) = \exp(a_0 + a_1 x + a_2 x^{2} + \dots + a_n x^{n})$, 
(in Keyfitz, 1982)                    where $y(x) = q(x)$, $q(x)/p(x)$, $\mu(x)$ or $e(x)$ 

RECENT NON-POLYNOMIALS 

Brillinger              1960          $\mu(x)$ [formula not legible in the source] 
Beard                   1961          $\mu(x)$ [formula not legible in the source] 
Petrioli                1981          $s(x)$ [formula not legible in the source] 
Martinelle              1987          $\mu(x)$ [formula not legible in the source] 
British actuaries       1980s         $q(x)/p(x) = A - Hx + B c^{x}$ 
(in Keyfitz, 1982) 

RECENT NON-POLYNOMIAL 

Siller                  1979          $\mu(x)$ [formula not legible in the source] 
Heligman-Pollard        1980          $q(x)/p(x) = A^{(x+B)^{C}} + D e^{-E(\ln x - \ln F)^{2}} + G H^{x}$ 
                                      $q(x) = A^{(x+B)^{C}} + D e^{-E(\ln x - \ln F)^{2}} + G H^{x}/(1 + G H^{x})$ 
                                      $q(x) = A^{(x+B)^{C}} + D e^{-E(\ln x - \ln F)^{2}} + G H^{x}/(1 + K G H^{x})$ 
                                      $q(x) = A^{(x+B)^{C}} + D e^{-E(\ln x - \ln F)^{2}} + G H^{x^{K}}/(1 + G H^{x^{K}})$ 
Brooks et al.           1980          [formula not legible in the source] 
Rogers and Planck       1983          $q(x)$ [formula not legible in the source] 
Kostaki                 1992          $q(x)/p(x) = A^{(x+B)^{C}} + D e^{-E_1(\ln x - \ln F)^{2}} + G H^{x}$ for $x \le F$ 
                                      $q(x)/p(x) = A^{(x+B)^{C}} + D e^{-E_2(\ln x - \ln F)^{2}} + G H^{x}$ for $x > F$ 
Rogers and Little       1993          $y(x) = a_0 + m_1(x) + m_2(x) + m_3(x) + m_4(x)$, where: 
                                      $m_1(x) = a_1 \exp(-\alpha_1 x)$ 
                                      $m_2(x) = a_2 \exp(-\alpha_2 (x - \mu_2) - \exp(-\lambda_2 (x - \mu_2)))$ 
                                      $m_3(x) = a_3 \exp(-\alpha_3 (x - \mu_3) - \exp(-\lambda_3 (x - \mu_3)))$ 
                                      $m_4(x) = a_4 \exp(\alpha_4 x)$ 
                                      $y(x) = q(x)$ or $q(x)/p(x)$ 

RECENT NON-POLYNOMIAL, PARTIAL AGE FUNCTIONS 

Hartmann                1981          ages 0 to 15 years:  $Y(x) = A_1 + B_1 \ln x$ 
                                      ages 15 to 35 years: $Y(x) = A_2 + B_2 x$ 
                                      ages 35 to 60 years: $Y(x) = A_3 + B_3 c^{x}$ 
                                      where $Y(x)$ is the logit of $l(x)$ 
Mode and Busby          1982          ages 0 to 10 years:  $\mu_0(x)$ [formula not legible in the source] 
                                      ages 10 to 30 years: $\mu_1(x) = \alpha_1 - \beta_1 (x - \gamma_1)^{2}$ 
                                      ages 30 and over:    $\mu_2(x)$ [formula not legible in the source] 

Note: In this table the standard demographic notation is used. The survival function s is such 
that for continuous x, s(x) is the probability of survival from birth until age x. The life 
table symbol l(x) is the empirical equivalent of the function s(x). The probability that a 
person aged x will survive to age x+1 is p(x) = s(x+1)/s(x) and the probability that a 
person aged x dies before reaching age x+1 is q(x) = 1 - p(x). The force of mortality, 
also known as mortality intensity, is $\mu(x) = -(ds(x)/dx)/s(x)$. The symbol m(x) is used for 
its empirical equivalent, the mortality rate. The symbol e(x) denotes life expectancy at 
age x, $\omega$ is the highest attainable age, and the remaining symbols are model 
parameters. 

The assumption underlying recently formulated non-polynomials was 
recognised and utilised as early as in the 19th century (see the models of 
Makeham, Opperman and Thiele in Table 1.1). Thiele (1872) proposed 
particularly accurate formulations for the three components of his model. It is 
therefore not surprising that many contemporary authors followed his 
proposals or arrived independently at similar ideas in their work. 

In 1979, Siller proposed a three-component competing risks model and in 
1980 Heligman and Pollard published their well-known article "The age 
pattern of mortality". Heligman and Pollard proposed one main model and 
three modifications of the original function. Since then, Thiele's formulation 
has been adopted by scientists more often. In 1980, Brooks et al. published a 
slightly modified version of the Heligman-Pollard function with a simplified 
child mortality term. In 1982, Mode and Busby adopted Thiele's original 
formulation, but selected a parabolic function for the second component, 
whereas in 1984 Rogers and Planck selected a double exponential curve for 
this term. A further adaptation of Thiele's formula, and at the same time a 
further modification of the Heligman-Pollard model, was proposed by Kostaki 
in 1992. The first and third terms on the right side of Kostaki's function are 
the same as in the original HP formulation, while in the middle term the 
parameter E, which indicates the spread of the accident hump, has been 
replaced by two parameters $E_1$ and $E_2$, related to the spread of the accident 
hump to the left and right of its peak, respectively. The major goal of all these 
attempts was to improve the fit, and in most cases this was achieved. 
However, as noted by Kostaki: "From a strictly practical point of view, the 
success of the best fit taking an additional parameter in the formula at the cost 
of parsimony might come to be regarded as a task whose applicability is not 
always justified or necessary." 
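
The three-component structure discussed above can be sketched in code. The function below implements the main Heligman-Pollard form for the odds q(x)/p(x) as it is usually written in the literature; the optional `E2` argument is a hypothetical way of expressing the left/right split of the accident hump described for Kostaki's variant. All parameter values are illustrative assumptions, not estimates.

```python
import numpy as np

def heligman_pollard_odds(x, A, B, C, D, E, F, G, H, E2=None):
    """Odds q(x)/p(x) as the sum of childhood, accident-hump and senescence terms.

    If E2 is given, E governs the spread of the hump for x <= F and E2 for x > F.
    Ages must be strictly positive because of the log(x) term.
    """
    x = np.asarray(x, dtype=float)
    spread = E if E2 is None else np.where(x <= F, E, E2)
    childhood = A ** ((x + B) ** C)                         # falls quickly after infancy
    hump = D * np.exp(-spread * (np.log(x) - np.log(F)) ** 2)  # peaks at age F
    senescence = G * H ** x                                 # Gompertz-like rise
    return childhood + hump + senescence

def odds_to_probability(odds):
    """Convert q/p (with p = 1 - q) back to the death probability q."""
    return odds / (1.0 + odds)

# Illustrative (hypothetical) parameter values.
params = dict(A=5e-4, B=0.01, C=0.10, D=1e-3, E=10.0, F=20.0, G=5e-5, H=1.10)
ages = np.arange(1, 101)
q = odds_to_probability(heligman_pollard_odds(ages, **params))
```

Each term dominates in a different age range, so the sum reproduces the familiar shape of the age schedule: high in infancy, a hump in young adulthood, and an exponential rise at older ages.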

In 1994, Rogers and Little summarised the above-mentioned attempts by 
publishing their multiexponential model schedule that is general enough to 
capture the varying shapes of the age pattern of mortality as well as of other 
demographic processes. The complete form of the model includes five 
components and a total of 13 parameters and "is a general purpose function 
that in its various reduced forms can represent simple unimodal curves, u- 
shaped curves, or more complicated bimodal curves with exponentially 
increasing or decreasing components" (Rogers and Little, 1994). 



1.2.1. Predicting Mortality by Using Parameterisation Models 

Prediction of mortality from parameterisation models is usually based on time 
series models (Box and Jenkins, 1976). The Box and Jenkins approach has 
been popular as a tool for the modelling of time trends observed in data on any 
demographic process. Population, fertility and mortality have all been 
predicted using time series methods. The popularity of these methods is related 
to their great flexibility, and to the fact that they allow for the incorporation of 
effects of (projected or hypothesised) changes in behavioural and 
socioeconomic variables in forecasts and permit the construction of confidence 
intervals. In the past, time series methods were often applied to model 
aggregate variables such as total population, total fertility or total mortality 
(brief reviews in: McNown and Rogers, 1989; Thompson et al., 1989), thus 
unavoidably ignoring changes in the age patterns of the predicted phenomena. 
An example of this type can be found in Lee (1974; in McNown and Rogers, 
1989) and Carter and Lee (1986; ibid.) who applied time series methods to 
forecast (US) fertility. Their predictions focused on the fertility level and 
assumed a constant age schedule of fertility over time, yielding unrealistic 
forecasts by age. Thompson et al. (1989; ibid.) proposed a more accurate 
approach (also for the US). They parameterised the age schedule of fertility 
using the Gamma function in the first step, and in the second step they 
predicted the parameters of the Gamma model estimated for a series of (64) 
subsequent annual age schedules. The parameters were predicted using 
(univariate and multivariate) time series methods. 

For mortality, several recent studies exist (e.g. McNown and Rogers, 1989; 
Rogers and Gard, 1991 and McNown and Rogers, 1992) in which the 
methods of Box and Jenkins (1976) have been used in the same way as in the 
parameterisation-time series approach proposed for fertility by Thompson et 
al. (1989). In mortality studies, (univariate) autoregressive integrated moving 
average (ARIMA) models have been particularly popular in forecasting the 
parameters of model age schedules. McNown and Rogers (1989) fitted an 
eight-parameter Heligman-Pollard (1980) curve (see Table 1.1) to the age 
schedule of US mortality for the years 1900-1985, obtaining a sequence of 
annual estimates of the parameters. After differencing to achieve stationarity, 
the eight univariate time series were estimated by ARIMA models to yield 
fitted Heligman-Pollard curves with time varying parameters. Extrapolation 
then provided the basis for forecasts. McNown and Rogers' procedure 
produced "medium-range forecasts of mortality that meet the standard tests of 
accuracy in forecast evaluation and that are sensible when evaluated against the 
comparable forecasts produced by the Social Security Administration" (ibid.). 
Later, the same approach was also applied to cause-specific mortality in the 
United States, using series (1960-1985) that were perhaps too short (Rogers and 
Gard, 1991 and McNown and Rogers, 1992). Further, multivariate time series models 
were applied by Hagnell (1991) to Swedish demographic data. Finally, Bell 
and Monsel (1991) showed that the dimensionality of the problem can be 
reduced by using the principal components. 
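
The second step of the parameterisation-time series approach can be sketched as follows. The example fits an ARIMA(1,1,0) model with a constant to one series of annual parameter estimates by ordinary least squares on the first differences, and then extrapolates it. This is a deliberately simple stand-in for the Box-Jenkins procedure, not the specification used in the studies cited; the series is synthetic and the function names are hypothetical.

```python
import numpy as np

def fit_arima110(series):
    """OLS fit of d_t = c + phi * d_{t-1} + e_t, where d_t is the first difference."""
    d = np.diff(np.asarray(series, dtype=float))
    X = np.column_stack([np.ones(len(d) - 1), d[:-1]])
    coef, *_ = np.linalg.lstsq(X, d[1:], rcond=None)
    return coef[0], coef[1]          # constant c and autoregressive coefficient phi

def forecast_arima110(series, horizon):
    """Point forecasts obtained by iterating the fitted difference equation."""
    series = np.asarray(series, dtype=float)
    c, phi = fit_arima110(series)
    level = series[-1]
    last_diff = series[-1] - series[-2]
    forecasts = []
    for _ in range(horizon):
        last_diff = c + phi * last_diff   # predicted next first difference
        level = level + last_diff         # integrate back to the level
        forecasts.append(level)
    return np.array(forecasts)

# Synthetic annual estimates of one model parameter (hypothetical data).
rng = np.random.default_rng(0)
estimates = 1.0 + np.cumsum(-0.01 + 0.005 * rng.standard_normal(40))
print(forecast_arima110(estimates, horizon=5))
```

In a full application the same step would be repeated for every parameter of the age schedule, and the forecast parameters would be substituted back into the schedule to obtain future age profiles.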

In the context of predictions based on the parameterisation approach, it is 
worth noting that the annual estimates of model parameters are rather unstable. 
This major drawback can be overcome by jointly modelling subsequent annual 
age schedules in the form of two-dimensional mortality surfaces. Solutions of 
this type are also discussed in this volume by Heathcote and Higgins, Chapters 
3 and 4, and Tabeau et al., Chapter 7. Mortality surfaces are modelled by 
parameterisation functions with time-dependent parameters that are estimated 
using arrays of age- and time-specific measures of mortality (i.e. death 
probabilities, odds ratios, or mortality rates) as input. Making projections is 
simple: it is done by giving values to two predictors, time and age. This yields 
sets of complete model age profiles for the years to come. 



1.3 | Statistical Association Models: the Lee and Carter Approach 

Lee and Carter's (L-C) model combines a parsimonious demographic model 
with statistical time series analysis (Lee and Carter, 1992; Carter and Lee, 
1992; Lee, Carter and Tuljapurkar, 1995). The model has been applied to 
forecast US mortality. It provides probabilistic confidence intervals for its 
forecasts. 

$$\ln m_{x,t} = a_x + b_x k_t + \varepsilon_{x,t} \qquad (1.1)$$ 

The model is specified for the logarithmic transformation of the age-(x) and 
period-(t) specific mortality rate, $m_{x,t}$, and depends on three parameters: 
$k_t$, the time-varying mortality level index; $a_x$, the first age-specific 
constant, interpreted as the general shape across age of the mortality schedule; 
and $b_x$, the second age-specific constant, showing which rates decline more 
rapidly and which more slowly in response to changes in k. The term 
$\varepsilon_{x,t}$ is the error. Model 1.1 can equivalently be written as: 

$$m_{x,t} = A_x \exp(b_x k_t + \varepsilon_{x,t}) \qquad (1.2)$$ 

with $A_x$ equal to $\exp(a_x)$. The model is not fully determined (the a-parameter can 
only be identified up to an additive constant, the b-parameter up to a multiplicative 
constant and the k up to a linear transformation) and requires additional 
constraints on the parameters to be successfully estimated. Lee and Carter 
applied singular value decomposition (SVD) in the model estimation. Using 
Model 1.1 a family of life tables can be generated as a function of k. As k 
varies on a chosen scale (for instance between 0 and 1; k=0 for the first 
available life table and k= 1 for the second) a family of life tables will be 
generated that includes the two as its basis. For k within the scale interval 
(i.e. [0, 1]), the model geometrically interpolates between the two life tables; 
for k outside the scale interval, it extrapolates from the two tables. The 
usefulness of this model is, therefore, at least two-fold: it informs us about 
both the dynamic and the static structure of the mortality process (i.e. the level 
of mortality over time and the age profile as estimated across the entire time 
series), and it can be used to generate missing age profiles of mortality. 
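
A minimal sketch of the SVD estimation step is given below. It assumes the normalisation commonly used with this model (the $b_x$ sum to one and the $k_t$ sum to zero), which is one possible choice of the additional constraints mentioned above. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def fit_lee_carter(log_m):
    """Estimate a_x, b_x, k_t from a matrix of log death rates (ages x years).

    a_x is the time average of the log rates; b_x and k_t come from the leading
    singular vectors of the centred matrix, rescaled so that sum(b_x) = 1.
    Assumes the leading left singular vector does not sum to zero.
    """
    log_m = np.asarray(log_m, dtype=float)
    a = log_m.mean(axis=1)
    centred = log_m - a[:, None]
    U, s, Vt = np.linalg.svd(centred, full_matrices=False)
    scale = U[:, 0].sum()
    b = U[:, 0] / scale              # age pattern of change, sums to one
    k = s[0] * Vt[0] * scale         # time index, sums to zero by construction
    return a, b, k

# Synthetic log rates built from known parameters plus small noise.
rng = np.random.default_rng(1)
n_ages, n_years = 10, 30
a_true = np.linspace(-8.0, -2.0, n_ages)
b_true = np.full(n_ages, 1.0 / n_ages)
k_true = np.linspace(15.0, -15.0, n_years)
log_m = a_true[:, None] + np.outer(b_true, k_true) + rng.normal(0.0, 0.02, (n_ages, n_years))

a, b, k = fit_lee_carter(log_m)
fitted = a[:, None] + np.outer(b, k)   # rank-one reconstruction of the log rates
```

Because only the product $b_x k_t$ enters the model, rescaling b by a constant and k by its inverse leaves the fitted rates unchanged; the normalisation merely fixes one representative of that family.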

Recently, Lee et al. (1995) discussed extensions of the basic method, i.e. 
approaches which impose more structure on mortality rates and reduce the 
number of parameters to be estimated or time series to be forecasted. For 
example, in the matrix of age-specific rates, the female rates can be stacked on 
top of the male rates, and the resulting matrix will contain the same number of 
periods, but twice the number of age groups. If this matrix is treated as if it 
referred to a single population, then a single time series of mortality index is 
estimated for both male and female rates. Another simplifying assumption is 
that men and women are permitted to have different $a_x$ and therefore different 
levels and differently shaped age schedules, while the $b_x$ are constrained to be 
the same. With a single mortality index, k, this would constrain male and 
female age-specific death rates to decline at the same rates, but to retain 
different levels and shapes. 

Note that the Lee and Carter model can be viewed as a member of the class of 
log-linear models, namely, a statistical association model for a two-way cross- 
classification table with age and time being the two classification criteria of the 
risk of death. This remark comes to mind when using the perspective of 
Goodman's work on the log-linear modelling, in particular following his 
analysis of the statistical association in contingency tables with ordered 
categories (Goodman 1979, 1981, and 1985; more recently also Goodman, 
1991). Given the independence model of the row (i) and column (j) variables, 
written as: $P_{ij} = \alpha_i \beta_j$, where $P_{ij}$ is the probability that an observation will fall 
into the ith row and the jth column of the table, and $\alpha_i$ and $\beta_j$ are two (non- 
negative) constants representing the usual row and column effects, the non- 
independence (saturated) model is the following: $P_{ij} = \alpha_i \beta_j \lambda_{ij}$. The $\lambda_{ij}$ 
parameters are interaction effects between the row and column categories. 
Non-independence models are generalisations of the independence pattern and 
are of interest when the latter model must be rejected. Different perspectives 
can be taken when formulating the non-independence models. In the 
(unweighted) association models proposed by Goodman, the interaction term, 
$\lambda_{ij}$, is specified as a function of the row ($\mu_{im}$) and the column ($\nu_{jm}$) scores 
multiplied by a so-called intrinsic association parameter $\phi_m$, and the model 
has the following general form: $P_{ij} = \alpha_i \beta_j \exp(\lambda_{ij})$ (Goodman 1979, 1981 and 
1985). More specifically, it can be written as: 

$$P_{ij} = \alpha_i \beta_j \exp\left(\sum_{m=1}^{M} \phi_m \mu_{im} \nu_{jm}\right) \qquad (1.3)$$ 



The number of components, M, is defined as M = min(I,J) - 1 and equals 1 for a 
2x2 table, 2 for a 3x3 table and so on. When Equation 1.3 is compared with 
Equation 1.2, we see that the Lee and Carter model includes the interaction 
effect defined as a product of age ($b_x$) and time ($k_t$) effects, whereas the 
column variable main effects ($\beta_j$) equal 1 for all years included (t). 

As a statistical association model, the Lee and Carter equation belongs to the 
class of generalised linear models (GLIM; McCullagh and Nelder, 1989) and 
can be estimated by the likelihood maximisation (LM) procedure for Poisson- 
distributed errors. In line with the GLIM approach, the LM estimation of the 
Lee-Carter model was done by Wilmoth in 1993. The number of deaths by 
age and time were taken as the data to fit, while both the age distribution of the 
population and the algebraic expressions for the death rates in terms of $a_x$, $b_x$, 
and $k_t$ were incorporated in the equation to fit. The LM estimation has been 
considered to be superior to the SVD procedure (Lee et al., 1995). 

Forecasting of mortality on the basis of the L-C model is done using the time 
index $k_t$. Future trends in $k_t$ are usually extrapolated from a (Box-Jenkins) time 
series model for $k_t$, such as a random walk model with drift. As for 
parameterisation functions, in this case, too, the use of time series in 
prediction lies in the general area of statistical and econometric modelling. In 
the same spirit, albeit from a different angle, Heathcote and Higgins in 
Chapter 3 and Tabeau et al. in Chapter 7 of this volume use a bivariate 
regression model to estimate the time-age surfaces defined by a time- and age- 
specific measure of mortality. Under assumptions appropriate to modelling 
mortality, Heathcote and Higgins show that the surface can be estimated by 
iteratively reweighted least squares. Judgemental factors can enter the 
forecasting process through possible alteration of the parameters of the fitted 
model. 
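
The extrapolation of the time index by a random walk with drift can be sketched as follows. The drift is estimated as the mean annual change of the index and the innovation standard deviation as the sample standard deviation of those changes. The interval shown ignores the uncertainty in the estimated drift, which is a simplifying assumption of this sketch; the function name and input values are hypothetical.

```python
import numpy as np

def forecast_rw_drift(k, horizon, z=1.96):
    """Random walk with drift: k_t = k_{t-1} + drift + e_t.

    Returns point forecasts and approximate lower and upper bounds. The bounds
    widen with the square root of the lead time and do not include the
    uncertainty of the drift estimate itself.
    """
    k = np.asarray(k, dtype=float)
    changes = np.diff(k)
    drift = changes.mean()
    sigma = changes.std(ddof=1)
    lead = np.arange(1, horizon + 1)
    point = k[-1] + drift * lead
    half_width = z * sigma * np.sqrt(lead)
    return point, point - half_width, point + half_width

# Hypothetical estimated index values for five consecutive years.
k_hat = np.array([0.0, -1.0, -2.5, -3.0, -4.5])
point, lower, upper = forecast_rw_drift(k_hat, horizon=3)
```

Forecast death rates then follow by inserting the projected index into Model 1.1, that is, as exp(a_x + b_x k_t) for each age.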

1.4 | Age-Period-Cohort Models 

Age-period-cohort (APC) models have been developed to separate period and 
cohort effects in time series of mortality schedules. An APC model can be 
classified as symptomatic: period effects approximate contemporary factors, 
such as the health status of the population, and cohort effects approximate 
historical factors. The APC parameters describe the trajectories of calendar 
time and cohort effects given the presence of the parameters describing the age 
pattern of mortality. A linear APC model is a log-linear model and can be 
written as a model for log-rates in which the effects of age, period and cohort 
combine additively (Equation 1.4). (In the model for rates, the effects combine 
multiplicatively.) 

$$\log \lambda_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k \qquad (1.4)$$ 

where $\mu$ is an intercept, and $\alpha_i$ (i=1,...,I), $\beta_j$ (j=1,...,J), and $\gamma_k$ (k=1,...,K) are 
the log-linear effects due to age, period and cohort, respectively. The usual 
constraints imposed on the parameters imply that: 

$$\sum_i \alpha_i = \sum_j \beta_j = \sum_k \gamma_k = 0$$ 

The linear dependence among the age, period and cohort variables (c = p-a) is 
also manifested in the indices i, j, and k (k = j-i+I), and the design matrix of 
the linear model is not full-rank. It is therefore impossible to obtain a unique 
set of $\alpha_i$, $\beta_j$, and $\gamma_k$ parameters. Note that each of the APC parameters can be 
expressed as the sum of two components: the so-called drift, that can be seen 
as the slope of the overall linear trend, and a departure from the drift. In the 
linear APC model (1.4) only the departure from the drift can be estimated 
while the drift itself cannot be estimated (e.g. Clayton and Schifflers, 1987b). 
The APC model, like other log-linear models, can be estimated using Poisson 
maximum likelihood (or weighted least-squares) methods. 
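
The rank deficiency of the design matrix can be checked numerically. The sketch below builds a dummy-variable design for an age-by-period table, with the cohort index derived from the other two, and shows that tilting the three sets of effects by a common slope leaves every fitted value unchanged. The function names are hypothetical and the table dimensions are arbitrary.

```python
import numpy as np

def apc_design(I, J):
    """Design matrix with an intercept, I age, J period and I + J - 1 cohort dummies."""
    K = I + J - 1
    X = np.zeros((I * J, 1 + I + J + K))
    row = 0
    for i in range(I):
        for j in range(J):
            k = j - i + (I - 1)            # zero-based version of k = j - i + I
            X[row, 0] = 1.0
            X[row, 1 + i] = 1.0
            X[row, 1 + I + j] = 1.0
            X[row, 1 + I + J + k] = 1.0
            row += 1
    return X

def drift_shift(I, J, s):
    """Change in the parameter vector that adds slope s to age, -s to period, s to cohort."""
    K = I + J - 1
    shift = np.zeros(1 + I + J + K)
    shift[1:1 + I] = s * np.arange(I)                  # age effects:    + s * i
    shift[1 + I:1 + I + J] = -s * np.arange(J)         # period effects: - s * j
    shift[1 + I + J:] = s * (np.arange(K) - (I - 1))   # cohort effects: + s * (j - i)
    return shift

I, J = 5, 6
X = apc_design(I, J)
print("columns:", X.shape[1], "rank:", np.linalg.matrix_rank(X))
print("fitted values unchanged:", np.allclose(X @ drift_shift(I, J, 0.3), 0.0))
```

Three of the missing ranks are removed by the usual sum-to-zero constraints; the remaining one corresponds to the drift, which no constraint based on the data alone can pin down.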

APC modelling has been popular in epidemiology with many applications in 
mortality by cause of death and disease incidence (e.g. Clayton and Schifflers, 
1987a, b). In demography there are some applications based on historical 
series of all-cause and cause-specific mortality data (e.g. Caselli, 1990; Caselli 
and Capocaccia, 1989; Wilmoth, Vallin and Caselli, 1990 and Willekens and 
Scherbov, 1991). Some of these are referred to below. 

The fundamental problem of APC models is the APC identification problem 
which is related to the fact that the three variables, age, period and cohort, are 
linearly dependent. In the context of mortality studies, suppose m(a,p) is the 
age-(a) and time-(p) specific mortality measure (e.g. death rate). The 
substitution of c=p-a gives a change from rectangular age-period co-ordinates 
(a,p) to diagonal age-cohort co-ordinates (a,c) of the Lexis plane. Hence, an 
age- and cohort-specific death rate m(a, p-a) can be obtained. In this sense 
there is no identification problem. An APC model is a theoretical construct 
that makes a clear conceptual distinction between the cohort and period 
dimensions of the death rates. Empirically, there are difficulties in APC 
modelling in estimating the two types of effects. Statistically speaking, for the 
linear APC model there is no unique set of parameters that result in an optimal 
fit to the observed data. In fact, there are infinitely many. In order to identify a 
solution, some constraints must be used for the parameters and the choice of 
constraints always remains arbitrary. In turn the existence of alternative 
constraints implies problems in the display and interpretation of the estimates 
of the model parameters. But the infinite number of possible solutions have 
something in common and this, and only this, can be interpreted. All in all, 
the APC identification problem is a scientific one in which data do not 
discriminate between different models or explanations. Advancing statistical 
methodology does not seem to be the optimal approach to solving the problem 
(Clayton and Schifflers, 1987b). 

One of the first scientifically justified solutions to the identification problem 
was proposed by Moolgavkar et al. (1979) (in: Clayton and Schifflers, 
1987b). In their model, the age effects were allowed to vary over calendar 
periods such that the age curve during one period was a fixed multiple of the 
corresponding curve during another period. These multiples were an extra set 
of parameters over and above those required by the age-period-cohort model. 
Moolgavkar et al.'s model was, however, difficult to interpret and depended 
upon a lack of fit of the basic age-period-cohort model. When the basic APC 
model did fit the data well, then the extended model degenerated to the basic 
form and the identification problem reappeared. 

From a scientific point of view, the identification problem disappears if one 
can assume a precise mathematical curve for the age pattern of mortality, 
provided that its form does not contain a log-linear component. The choice of 
mathematical function should be based on biological evidence. Otherwise, the 
modelling process, even if confirmed by a good fit, is limited to a still 
arbitrary 'repartition' of drift. In the 1980s, Boyle et al. (1982) and Boyle and 
Robertson (1987) applied the re-specification of the age component in 
modelling cancer trends. More recently, Holford et al. (1994) proposed an 
APC model of cancer incidence with a multistage carcinogenesis function for 
the age variable (see the $\alpha(i)$ component in Equation 1.5; $\phi_0$, $\phi_1$ and $\phi_2$ are 
model parameters, i is the age index). Holford et al. also experimented with 
other functions of age. 

$$\log \lambda_{ijk} = \mu + \alpha(i) + \beta_j + \gamma_k, \qquad \alpha(i) = \phi_0 \log(i) + \phi_1 \log(1 - \phi_2 i) \qquad (1.5)$$ 

Caselli and Capocaccia (1989), too, approached the identification problem on 
the ground of scientific justification. They replaced the usual cohort effects by 
age-cohort interaction terms, each defined as the product of an extra age 
parameter, $\delta_i$, and an explanatory variable $g_k$ (Equation 1.6). The $g_k$ terms 
were meant to measure cohort influences and were equal to the cohort death 
probability in infancy ($g_k = q_0(k)$) or alternatively during the first 15 years of 
life ($g_k = {}_{15}q_0(k)$). The modified model did not need any formal constraints 
for the identification. Wilmoth (1997) classified this approach as an example 
of the direct measurement of the factors which are only proximately 
summarised by the age, period and cohort coefficients of the basic APC 
models. 

$$\log\left(\frac{q_{ijk}}{1 - q_{ijk}}\right) = \mu + \alpha_i + \beta_j + \delta_i g_k \qquad (1.6)$$ 



Wilmoth (1990) proposed a modification of the basic APC model which can 
also be viewed as a scientifically justified reflection of reality (Equation 1.7). 
Similarly to Clayton and Schifflers, his proposition of approaching the APC 
identification problem sought "findings that are invariant to the choice of 
identifying constraints" or that "acknowledge the fundamental arbitrariness of 
these constraints" (Wilmoth, 1997, p. 201). The application shown by 
Wilmoth was an example of the second solution (i.e. arbitrariness of 
constraints). He introduced age-period interaction terms $\gamma_i \delta_j$ into the linear 
APC model and at the same time constrained the cohort parameters $\theta_k$ to 
express the deviations from the overall pattern of change by age and period. 
The number of interaction terms $\rho$ depended on the improvement in the fit 
brought by each subsequent term. Wilmoth justified the use of the age-period 
interactions by the fact that "the pace of change in demographic rates often 
differs across age. For example, mortality rates have typically fallen much 
more rapidly at younger than older ages". This component "cannot be 
expressed well in a model with no age-period (or age-cohort) interaction". 
However, "the introduction of interaction terms by no means resolves the 
identification problem in the standard APC model. Their use is consistent 
however, with the philosophy that the proper way to include all three sets of 
parameters in the model is to focus the description on two dimensions and to 
treat the third dimension as a sort of residual" (ibid., p. 195). 

$$\log\left(\frac{q_{ij}}{1 - \tfrac{1}{2} q_{ij}}\right) = \mu + \alpha_i + \beta_j + \sum_{m=1}^{\rho} \gamma_{im} \delta_{jm} + \theta_k \qquad (1.7)$$ 

Note that the APC identification problem is caused by the existing linear 
dependency among the age, period and cohort variables when the 
measurement is thought to be continuous in time (e.g. Osmond and Gardner, 
1989). The empirical rates in APC modelling are most commonly age- and 
period-specific (a two-way cross-classification), whilst cohorts are derived 
from diagonals of the age- and period-specific arrays of rates. These data do 
not differ conceptually from the continuous-time measurement. However, 
sometimes deaths are aggregated due to another scheme: by age, period and 
cohort (a three-way cross-classification also called 'double' or 
'non-overlapping cohorts'). In this case linear dependency among the three 
variables is destroyed (ibid.). In the late 1980s, this observation led several 
authors to use the non-overlapping cohorts in resolving the identification 
problem. This approach can be considered to be purely statistical (e.g. 
Willekens and Baydar, 1986; Robertson and Boyle, 1986 and Boyle and 
Robertson, 1987). 

In the non-overlapping cohort approach the APC model remains basically the 
same as given in Equation 1.4. However, the relationship among the age, 
period and cohort variables, c=p-a, no longer holds true. The difference p-a is 
linked to two cohorts c1 and c2 instead of the cohort c. Cohort c1 is considered 
'younger' (lower triangle deaths on the Lexis diagram) and c2 'older' (upper 
triangle). Both c1 and c2 are associated with one particular age group and one 
particular calendar period (i.e. one age-period square on the Lexis diagram). It 
is clear that one can arrive at a unique estimate for the linear model 1.4 using 
the non-overlapping cohort data. 

It should be noted, however, that this is made possible by some non-explicit 
assumptions underlying model formulation. As suggested by Osmond and 
Gardner (1989), non-overlapping cohort model identification is obtained by 
assuming the identity of respective effects from two separate APC models, one 
specified for the lower-triangle data structure and one for the upper-triangle 
data. Conceptually, the two triangle data structure models are the only true 
APC specifications and the non-overlapping cohort model is a simplification 
that is not always justifiable, in particular if longer age/time intervals are used 
in the analysis. Death rates for many diseases increase rapidly with age, so that 
constraints for the age effects in the triangles may not be valid. If, however, 
single-year data are used, this critique is in fact no longer justified. As a result, 
the non-overlapping cohorts APC model has been applied frequently (e.g. 
Willekens and Baydar, 1986; Robertson and Boyle, 1986; Boyle and 
Robertson, 1987 and Vermont, 1990). 

Forecasting based on APC models may seem straightforward.
However, although the age and cohort effects estimated from the APC model
can be assumed to remain fixed in the future, the model cannot be used
for forecasting under the assumption of no future period effects. Forecasting
is therefore not possible without sufficient knowledge of the epidemiology of
a given disease and of anticipated trends in population exposure to the major
etiological factors concerned. In the case of all-cause mortality, making
assumptions for future period effects is a complex task.



1.5 Mortality Forecasts/Projections/Scenarios in International Agency
Practice 

Typically, international organisations allow considerable involvement of expert 
judgement in forecasting mortality. It is also recognised and widely accepted 
that international organisations prepare projections and/or scenarios and leave 
the preparation of the most likely variants, i.e. the forecast, to the countries 
themselves. International projections of mortality are usually done in terms of 
changes anticipated in life expectancy rather than in death rates or 
probabilities. Usually time series of data on overall (i.e. all-cause) mortality 
serve as the input for projections. Interpolation is the most common method in 
international projections. The main issues relating to the procedures of these 
organisations are therefore target life expectancies and target years. The
remainder of this section comments on current practice at the United Nations,
the World Bank and Eurostat.

United Nations 

The United Nations regularly publishes world population projections. Mortality
is projected for use in these projections in two steps: firstly, by assuming
future changes in life expectancy at birth by sex, and secondly, by calculating
(by interpolation) age- and sex-specific survival ratios that are consistent with
both the assumed life expectancy and the current national age-sex patterns of
mortality.

In 1994, life expectancy was generally projected to rise continuously (UN, 
1994). Three working models were developed: a high, middle and low rise in 
life expectancy. One of these models was used for each country. After 2025, 
life expectancy at birth is assumed to rise according to the middle mortality 
schedule for all countries. Although these models exhibit different paces of 
future survival, they all assume that improvement slows as life expectancy 
rises. The highest life expectancy at birth allowed in these models is 87.5 
years for females and 82.5 years for males. The most often used middle model 
assumes that male life expectancy at birth will increase by 2.5 years every five 
years until it reaches 60 years and then that the five-year gain is gradually 
reduced to 0.4 years at a life expectancy at birth of 77.5 years and remains at 
0.4 years per five years thereafter. Female life expectancy at birth is assumed 
to increase by 2.5 years every five years until it reaches 65 years; from then 
on the five-year gain is gradually reduced to 0.4 years at a life expectancy of
82.5 years, where it remains thereafter. Having extrapolated life expectancies
at birth, the UN experts used standard life tables to derive the age pattern of
mortality for the various life expectancy levels. For the highest (projected) life
expectancies, however, there were no standard reference tables, and the values
from the standard reference tables had to be extrapolated by age.
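
The middle model described above can be written as a simple rule for the five-year gain in life expectancy. The sketch below is only an illustration: the text says that the gain is "gradually reduced", and a linear reduction between the two thresholds is assumed here; the function names `five_year_gain` and `project_e0` are hypothetical, and capping at the highest allowed life expectancy is likewise an assumption about how the maximum is applied.

```python
# Sketch of the UN (1994) middle model for life expectancy at birth (e0).
# ASSUMPTION: the gain declines linearly in e0 between the two thresholds;
# the source only states that it is "gradually reduced".

PARAMETERS = {
    # sex: (e0 where the slowdown starts, e0 where the gain reaches 0.4, maximum e0)
    "male": (60.0, 77.5, 82.5),
    "female": (65.0, 82.5, 87.5),
}


def five_year_gain(e0, start, end, high=2.5, low=0.4):
    """Gain in e0 (years) over the next five-year period."""
    if e0 <= start:
        return high
    if e0 >= end:
        return low
    fraction = (e0 - start) / (end - start)
    return high + fraction * (low - high)


def project_e0(e0, periods, sex="male"):
    """Project e0 forward for a number of five-year periods."""
    start, end, ceiling = PARAMETERS[sex]
    path = [e0]
    for _ in range(periods):
        e0 = min(e0 + five_year_gain(e0, start, end), ceiling)
        path.append(e0)
    return path


print(project_e0(55.0, 6, "male"))
```

Because the gain depends only on the current level, countries with lower life expectancy gain faster than those with higher levels under this rule.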

As noted by Mesle (1992), the procedure described above has a strong
levelling-out effect. The rate of progress in life expectancy at birth declines as
its level increases, thus inevitably reducing the gaps between countries and
between the sexes. Swedish, Italian and French females are good examples of
this. Another consequence of the procedure is that males benefit more than
females, thus further reducing excess male mortality. The limit values for life
expectancy at birth applied in the 1994 Revision are relatively low; they had
not been changed since 1989. The most recent edition of the World Population
Prospects (UN, 1999) compensates for this disadvantage and extends life
expectancy calculations up to the age of 100, with higher upper targets for
average life duration in all countries.

World Bank 

The nature of the World Bank's projections changed in the 1992-93 Edition of 
the World Population Projections (Bos et al., 1992). The procedures used to
project mortality were developed as usual from analyses of trends in life
expectancy and infant mortality in available national data. But the 1992-93
projections utilised extended life tables (Coale and Guo, 1989) with the
maximum male and female life expectancy assumed to be 83.3 and 90 years
respectively. These maxima were revised upwards in 1991 because the previously
used maxima were barely above the levels currently estimated for the lowest
mortality countries. Given the evidence of possible effects on life expectancy
of controlling major risk factors, a higher maximum life expectancy was
considered appropriate.

Life expectancy is projected using a logistic function of the rate of change in 
mortality over time. The function is set to rise most rapidly from a (life 
expectancy) level of about 50 years and increasingly slowly at higher levels. 
The rate of change for the logistic function is allowed to vary across countries 
and for a given country over time. It is estimated outside the logistic function, 
per five-year block. A different speed is assumed for different five-year 
blocks. The estimation involves autoregression of the rate of mortality change 
and of socioeconomic factor variables in countries (e.g. the female secondary 
enrolment ratio or percentage of the urban population). 

Eurostat 

Lopez and Cruijsen (1991) produced their scenarios using epidemiological 
knowledge about trends in mortality by main causes of death and related 
changes in prevalence of risk factors and trends in health care. Certain logistic 
'ceilings' were used to reflect the fact that equivalent proportional declines in 
death rates are evidently more difficult to achieve at lower levels of mortality. 
In these scenarios, a moderate increase in future life expectancy is assumed. 

In 1996, the European Commission commissioned the Population Research 
Centre of Groningen University in co-operation with Eurostat to produce a set 
of mortality scenarios (Eding, Willekens and Cruijsen, 1996). These scenarios 
were not based on any particular maximum values of life expectancy at birth,
but it was assumed that recently observed trends in life expectancy would
continue in the future. Target (2050) values for life expectancy at birth were
obtained as educated guesses from the (most recently observed) age-, sex- and
country-specific rates. Hermite interpolation was used to produce mortality
rates between the starting (1994) and the ending (2050) values.
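
As an illustration of the interpolation step, the sketch below implements a standard cubic Hermite interpolant between a starting and an ending value. The text does not state which slopes were imposed at the two end points, so the slopes used here (a recent annual trend at the start and a flat slope at the target) are assumptions, as are the function name `hermite` and the example rates.

```python
# Sketch of cubic Hermite interpolation between a base year and a target year.
# ASSUMPTION: the end-point slopes m0 and m1 are chosen by the analyst; the
# source does not specify them.

def hermite(t, t0, y0, m0, t1, y1, m1):
    """Cubic Hermite interpolant through (t0, y0) and (t1, y1) with slopes m0, m1."""
    h = t1 - t0
    s = (t - t0) / h                      # position in [0, 1]
    h00 = 2 * s ** 3 - 3 * s ** 2 + 1     # weight of y0
    h10 = s ** 3 - 2 * s ** 2 + s         # weight of m0
    h01 = -2 * s ** 3 + 3 * s ** 2        # weight of y1
    h11 = s ** 3 - s ** 2                 # weight of m1
    return h00 * y0 + h10 * h * m0 + h01 * y1 + h11 * h * m1


# Hypothetical death rate: 0.010 in 1994, halved to 0.005 by 2050,
# starting with a recent decline of 0.0002 per year and levelling off in 2050.
for year in (1994, 2010, 2030, 2050):
    print(year, hermite(year, 1994, 0.010, -0.0002, 2050, 0.005, 0.0))
```

The interpolant reproduces the starting and ending values exactly, and the choice of slopes determines how quickly the path moves towards the target.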

The end values of the medium variant were arrived at using assumptions about 
reductions in the age-specific death rates in the scenario horizon. The rates of 
reduction were different over age. For the first two years of life, mortality 
rates were reduced by 30 per cent and 40 per cent respectively. Ages 2 to 71 
were reduced by 50 per cent. From age 72 until age 91+, the reduction factor
decreased slowly from 49 per cent for age 72 to 30 per cent for age 91+. For
the 15 European Union countries, life expectancy at birth in the medium 
variant was set to be approximately 78.9 years for men by 2050 (on average 
73.4 in 1994) and 84.1 years for women (79.7 in 1994). The increase for men 
was 5.5 years and for women 4.4 years. The low variant assumed that all age- 
specific mortality rates in 2050 would be at 130 per cent of the end values in 
the medium variant (30 per cent higher than the medium variant). For the high 
variant the initial age-specific mortality rates were set at 70 per cent of those in 
the medium variant (30 per cent lower than in the medium variant). 
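
The age schedule of reductions can be sketched as a small function. The text says only that the reduction decreases "slowly" between ages 72 and 91+; a decline of one percentage point per year of age reproduces both stated end points (49 per cent at age 72 and 30 per cent at age 91+) and is assumed here. The function names are hypothetical, and the high variant is read as 70 per cent of the medium-variant end values, by symmetry with the low variant.

```python
# Sketch of the medium-variant reductions in age-specific death rates, 1994-2050.
# ASSUMPTION: between ages 72 and 91 the reduction falls by one percentage
# point per year of age, which matches the two stated end points.

def reduction_percent(age):
    """Percentage reduction in the death rate at a given age (91 means 91+)."""
    if age == 0:
        return 30
    if age == 1:
        return 40
    if age <= 71:
        return 50
    if age >= 91:
        return 30
    return 49 - (age - 72)


# Multipliers applied to the medium-variant end values.
VARIANT_FACTOR = {"medium": 1.0, "low": 1.3, "high": 0.7}


def target_rate(rate_1994, age, variant="medium"):
    """End value (2050) of a death rate under the chosen variant."""
    medium = rate_1994 * (1 - reduction_percent(age) / 100)
    return medium * VARIANT_FACTOR[variant]


for age in (0, 1, 40, 72, 80, 91):
    print(age, reduction_percent(age))
```

Note that the low variant refers to low life expectancy, and therefore to higher death rates than in the medium variant.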

Statistics Netherlands (De Beer and De Jong, 1996) prepared the 1996 
population scenarios for the countries of the European Economic Area. The 
mortality assumptions of these scenarios were based on an analysis of age- 
specific mortality rates and life expectancy at birth by gender, similar to Eding 
et al.’s scenarios. Quantitative assumptions were made regarding the rate of 
increase in life expectancy. In addition, assumptions were made regarding the 
change in the age pattern of mortality rates. 

In the baseline scenario, men's life expectancy at birth is assumed to increase 
by an average of five years (eight years in the high scenario) between 2000 
and 2050. The (average) increase for women is four years (six years in the 
high scenario). In the low scenario, only a very limited increase in life
expectancy is assumed for both men and women. For females in the baseline 
variant, the majority of countries have slightly lower levels of life expectancy 
than those produced by the UN. A few examples are: Finland, France, 
Germany, the Netherlands and Spain. For males, the situation is similar.



1.6 Forecast Errors

Forecasting mortality usually serves practical needs. In many countries all 
over the world, mortality forecasts are used to create and modify old-age or 
disability insurance systems and other social security programmes. All these 
programmes require the uncertainty of mortality predictions to be expressed
stochastically, for instance as probabilistic confidence intervals
attached to the variants produced. In practice, although many forecasts
and projections include alternative variants of future developments in
mortality, no probabilistic interpretation of the variants is given. This
limitation of current predictions of mortality is one of the most urgent
issues in the forecasting of mortality.

The stochastic propagation of forecasting errors for mortality has its origins in
the broader context of stochasticity of national population forecasting. In this
regard, Alho and Spencer (1990a, 1990b) and Alho (1992) have made
important contributions to the treatment of mortality in national forecasts.
Their results build on the works on general population forecasting published by
Lee (1974), Lee and Carter (1992), Stoto (1983), Cohen (1986), Keilman (1990),
Alho and Spencer (1985, 1991) and Lee and Tuljapurkar (1994), as quoted by
Alho and Spencer (1997).

The uncertainty of demographic forecasts can be assessed by calculating ex 
post and/or ex ante forecast errors. Ex post errors are based on the analysis of 
past forecasts as compared with their known realisations. In turn, ex ante
errors do not require the realisations of a process to materialise and represent
the expected accuracy of the forecast being made. The two types of errors are
in no way competitive, but complementary. A stochastic interpretation can be 
attached to the forecasts being evaluated, irrespective of the type of forecast 
error used in uncertainty assessment. 
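
A minimal sketch of an ex post error calculation is given below. The forecast and realised values are invented purely for illustration, and the function names and the choice of summary measures (mean error, mean absolute error and root mean squared error) are assumptions rather than measures prescribed in the text.

```python
# Sketch of ex post forecast errors: past forecasts compared with realisations.
# All numbers are hypothetical and serve only to illustrate the calculation.

def ex_post_errors(forecasts, realisations):
    """Forecast minus realisation, for each forecast year."""
    return [f - r for f, r in zip(forecasts, realisations)]


def summarise(errors):
    """Return mean error (bias), mean absolute error and root mean squared error."""
    n = len(errors)
    mean_error = sum(errors) / n
    mean_absolute_error = sum(abs(e) for e in errors) / n
    rmse = (sum(e * e for e in errors) / n) ** 0.5
    return mean_error, mean_absolute_error, rmse


# Hypothetical forecasts of life expectancy at birth and the observed values.
forecast = [75.0, 75.4, 75.8, 76.1]
observed = [75.3, 75.9, 76.6, 77.2]

errors = ex_post_errors(forecast, observed)
print(errors)
print(summarise(errors))
```

A negative mean error, as in this made-up example, would indicate that the forecasts systematically underestimated life expectancy.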

Keilman (1990) presented a notable study of uncertainty in Dutch
national population forecasts. He described the sources and measurement of
uncertainty and explored the patterns in the ex post forecast errors for all 
components of the population change (mortality, fertility and migration) in the 
Netherlands. He also proposed methods to estimate uncertainty in future 
population forecasts and gave recommendations about how to enhance the 
accuracy of future population forecasts.

The work of Alho and Spencer (1990a, 1990b) and Alho (1992) in turn
focuses on ex ante errors. This tradition is actually closer to the statistical
modelling approaches discussed in this chapter. The classification of sources
of forecast errors proposed by Alho and Spencer (1985) is worth noting in this
context. There are four major sources of errors in forecasts:
model misspecification, errors in parameter estimation, errors in expert
judgement (and in its importance in forecasting) or in beliefs about model
parameters, and random variation. Alho and Spencer developed strategies to
deal with these sources and proposed methods for ex ante error assessment in
national forecasting of mortality. Manton (1993) elaborated these strategies in
the broader context of health status forecasting in the population. In brief, the
first source (model misspecification) can be evaluated, for instance, by
simulating scenarios representing different specifications of the model
structure. Multiple simulations generate a distribution of outcomes for a
systematically determined set of scenarios. The second source of uncertainty
involves estimates of parameter variability. If all parameters are generated
from one data set, the error depends on the probability model used in estimation.
If parameters are estimated from two (or more) sources, the error involves the
dependency of parameters estimated from independent domains. Sampling
theory must be examined in order to know what to assume about the
unobserved covariances between parameter estimates from different data sets.
The last source of variability is uncertainty due to random variations in future
rates. This can be due to heterogeneity of individual risks or perturbations from
external factors. Both factors should be represented in the forecasting models.



1.7 Prospects for Modelling and Forecasting of Mortality

In the early 1990s, Willekens (1990) evaluated the state of
demographic forecasting and research needs at that time. The general trend is
towards an understanding of the underlying biological and behavioural causal
mechanisms operating at an individual level rather than an unproven
extrapolation of past trends. Forecasting should go from pattern-oriented to
process-oriented forecasts. Demographic, biomedical and/or epidemiological
theories should be starting points for such forecasts. Knowledge obtained in
cross-sectional static and comparative studies will be useful in formulating
these theories. He concluded that "the ultimate goal of forecasting-oriented
demographic research should be (...) rooted in an understanding of the causal
process". This is because uncertainty cannot be reduced without knowledge of
the causal mechanisms. The problem of predictability of demographic processes
seems to be one of the most important issues in demographic modelling.
Demographic processes are increasingly dependent on decisions made by
individuals, which are practically unpredictable. One therefore has to accept
increasing uncertainty in demographic processes. This acceptance implies that
strategies are needed to deal with uncertainty. No single approach can produce
error-free forecasts. Willekens concluded that we should combine the results of
several approaches to minimise the risks of missing the targets.

The line of causal modelling of mortality referred to above was followed 
mainly by Manton's group. Some details of these developments are discussed 
by Van den Berg Jeths et al. in Chapter 2. Based on the experience gathered 
from the large number of studies they carried out, Manton (1993) stressed that 
forecasting mortality (and health) should be based on models of human ageing 
which in turn should rely on (a proper understanding of) physiological 
processes involved. The advances necessary to improve forecasting methods 
for mortality (and health) should include using a broad range of data types, 
appropriately combining data from multiple sources and improving the 
biological realism of forecasts. In particular, the models should be based on 
stochastic processes that explicitly describe individual ageing and mortality. In 
Chapter 11 Yashin discusses examples of such models, proposed on the basis 
of heterogeneity, stochasticity and homeostatic forces in ageing. At
present there is little methodological progress in these models, and this
hampers their practical use. The data needed for the models are also
intrinsically limited. Because of the models and data available, mainly sketchy
descriptions of the underlying processes can be used in population-level 
forecasting. Thus, forecasting models can only approximate the individual 
ageing processes. The degree of approximation must be reflected in the 
uncertainty of forecasts. 

The theory-based predictive models provide a contrast to extrapolation
methods, which are conceptually simpler tools for forecasting mortality.
However, extrapolation methods suffer from the fact that it is increasingly
difficult to formulate
assumptions for future mortality developments exclusively on the basis of past 
trends in mortality. The extrapolation methods discussed here and in recent 
use, i.e. age-period-cohort models, Lee-Carter log-linear models and 
parameterisation models, all require at least one parameter to be forecast 
outside the projection model, before the forecast can be completed. Most 
authors use time-series methods to forecast this (these) parameter(s).
Sometimes (i.e. in the parameterisation models with time-dependent
parameters) mathematical curves are used for extrapolating the trend(s)
instead of time-series modelling. The formulation of the curves can be
chosen to satisfy both past trends and future expectations on mortality. None
of this, including the latter solution, can be considered a good alternative to
the process-oriented approach. However, the process-oriented modelling of
mortality requires excellent individual-level data on risk factors and,
simultaneously, on mortality, as well as appropriate methodological approaches,
neither of which is readily available at present.
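
The step of forecasting a parameter outside the projection model can be illustrated with one simple time-series choice, a random walk with drift. This is only a sketch: the text does not prescribe a particular time-series model, the index values below are invented, and the function name `random_walk_with_drift` is hypothetical. The intervals shown ignore the uncertainty in the estimated drift and are therefore somewhat too narrow.

```python
# Sketch: forecasting a single period parameter (e.g. a mortality index) with a
# random walk with drift. ASSUMPTION: this model is one common choice, not the
# method of any particular study; the data are made up.
from statistics import mean, stdev


def random_walk_with_drift(series, horizon, z=1.96):
    """Return (point, lower, upper) forecasts for 1..horizon steps ahead.

    The series must contain at least three observations so that the
    standard deviation of the first differences can be estimated.
    """
    differences = [b - a for a, b in zip(series, series[1:])]
    drift = mean(differences)          # average change per period
    sigma = stdev(differences)         # variability of the changes
    last = series[-1]
    forecasts = []
    for h in range(1, horizon + 1):
        point = last + h * drift
        half_width = z * sigma * h ** 0.5   # innovation uncertainty only
        forecasts.append((point, point - half_width, point + half_width))
    return forecasts


# Hypothetical time-varying index, declining as mortality improves.
index = [0.0, -1.0, -2.5, -3.0, -4.0]
for step, (point, lower, upper) in enumerate(random_walk_with_drift(index, 3), 1):
    print(step, round(point, 2), round(lower, 2), round(upper, 2))
```

The widening of the interval with the forecast horizon is one way in which the uncertainty of an extrapolated parameter can be made explicit.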

It therefore seems inevitable that mortality will continue to be forecast
using extrapolation rather than process-oriented methods. Any method
discussed in Chapter 1 is a good choice. What is important is to revise the
ways of applying these methods in order to produce good forecasts. The risk
of improper extrapolation can be decreased by introducing new approaches
into the forecasting. In extrapolation, trends in mortality by cause of death are
particularly useful, firstly because they are clear and
secondly because epidemiological knowledge can be used when formulating
assumptions for the future. Moreover, one can also envisage using cohort
mortality trends more extensively in forecasting. Finally, perhaps the key
parameters that have to be forecast separately should not be extrapolated using
only time-series methods but rather be modelled using alternative approaches
and additional information.

With respect to international practices in predicting mortality, usually only 
projections and scenarios are presented and not forecasts. This is related to the 
fact that countries themselves usually prepare the forecasts. Formal modelling 
has not been popular practice within international organisations. They rely on 
past trends in mortality and on expert judgement of future trends. This
concerns not only the ultimate baseline level of life expectancy but also the
uncertainty of international projections or scenarios, which is commonly 
expressed as low and high alternative variants of mortality changes. 

Finally, interpolation has always been popular in predicting mortality amongst 
both scientists and practically oriented organisations. Interpolation could 
become even more useful if targets were selected by scientifically justified,
reliable procedures. Selecting targets can be seen as making estimates of
country-specific temporary maximum levels of life expectancy. This aspect of
mortality research needs new inputs as past results have become increasingly
meaningless (e.g. Manton et al., 1991). Conclusions from longevity research
would constitute an invaluable source of information for such estimates. In
addition, the demographic evidence about (sex, age, cause of death,
socioeconomic status and marital status) differentials of mortality could be
incorporated into such modelling approaches.



Abstract

This chapter discusses epidemiological models that take disease processes and 
related risk factors as the basis for modelling mortality and morbidity. These 
models can be roughly divided into two groups: statistical regression models 
and dynamic multistate models. For each group of models, examples are given 
for infectious and chronic diseases. Strong and weak aspects of each group of 
models are summarised in the context of their aims and data requirements. 

The chapter consists of four sections. Section 2.1 is an introduction. In Section 
2.2, devoted to the regression models, the Alderson and Ashwood model for 
prediction of lung cancer mortality in England and Wales and the Murray and 
Lopez study of the global burden of disease are discussed. Section 2.3 focuses 
on dynamic multistate models. In Section 2.3.1 the (macrosimulation) method 
of back-calculation of HIV/AIDS-related mortality, and a (microsimulation) 
model of the spread of two sexually transmitted diseases, gonorrhoea and 
chlamydia, are given as examples of the multistate models for infectious 
diseases. The Dutch model PREVENT and the risk factor intervention models 
of Manton and colleagues are reported in Section 2.3.2 as examples of the 
multistate models of chronic diseases. Finally, the models MISCAN (Erasmus 
University, Rotterdam) and POHEM (Statistics Canada) illustrate the 
microsimulation approach in multistate modelling of chronic diseases. The 
final section (2.4) discusses the use of epidemiological models for research and 
health policy purposes.



2.1 Introduction

In the previous chapter, several demographic models for projecting mortality 
were reviewed. From an epidemiological point of view, patterns of morbidity 
and mortality are primarily ‘explained’ by the distribution within populations 
of risk factors, such as smoking, dietary habits or physical inactivity (lifestyle), 
socioeconomic variables or environmental exposures. Quantitative models 
have been developed for statistical analysis of the associations between 
explanatory variables and the risks of morbidity and mortality in 
epidemiological studies. These models can be generalised to predict mortality 
and morbidity risks in other populations, for instance to assess the effects of 
risk factor prevention programmes or treatment interventions. Figure 2.1 gives
an overview of classes of the explanatory variables distinguished (Ruwaard 
and Kramers, 1998). 

Over the last 150 years, life expectancy has increased substantially in the 
Western world, from around 30 to around 80 years, mainly as a result of 
successful prevention of environment-related infectious diseases as an 
important cause of mortality at younger ages. Concomitant with economic 
development, safe drinking water facilities and sewage control were 
introduced, which in turn improved nutrition and personal hygiene. From the 
third decade of the twentieth century, vaccination and antibiotics added another 
20 years to life expectancy. In the present situation, the occurrence of 
(chronic) disease is postponed to the later stages of life, involving different risk 
factors. Murray and Lopez have shown that this epidemiological transition 
from infectious diseases to chronic diseases as the leading cause of death is 
also taking place in most other countries (McKeown, 1976; Bakkes and
Woerden, 1997; Mackenbach, 1988; Murray and Lopez, 1997d).

Modelling of infectious diseases has a longer history than modelling of chronic 
diseases. From the end of the nineteenth century, epidemiological research 
into the etiology of infectious disease has provided insight into the causative 
agents and determinants of infection. The first generation of simple
deterministic infectious disease models emerged between the two World Wars.
From the 1950s, more complex,
stochastic models were developed to deal with the variability of the spread of 
infectious diseases (King and Soskolne, 1988; Susser and Susser, 1996).

Figure 2.1. Classes of determinants of health status (diagram not reproduced; legible labels: Health policy, Health status)

Source: Ruwaard and Kramers (eds.), 1998.



Insights into the etiology of chronic diseases were not gained until the 1960s
and after. Most attention has been given to cardiovascular diseases and cancer,
the main causes of death in the Western world today. Since then, many large
studies have been organised to assess the relationship between lifestyle factors
and other personal characteristics and the risks of mortality and morbidity,
such as the Framingham Study, the Multiple Risk Factor Intervention Trial
(MRFIT) and the Whitehall Study (Susser and Susser, 1996 and
Pierce, 1996).

Nowadays, health status is regarded as a function of many interacting
determinants, both endogenous, such as genetic constitution, and exogenous,
such as lifestyle, socioeconomic factors and environmental exposure (see
Figure 2.1). In view of this complex and multi-causal nature of disease genesis,
it is often difficult to determine whether statistical associations found in
epidemiological studies are indeed causal (Rothman, 1986). Several criteria have
been defined to evaluate the causality of statistical associations found in
epidemiological studies, such as strength, consistency, specificity, temporality,
biological gradient, plausibility, coherence, experimental evidence and analogy
(Hill, 1965). Obviously, similar difficulties with respect to the (multi-) causal
nature of disease arise when modelling (chronic) disease.

In this chapter we will review epidemiological models that have been 
developed to describe the relationship between risk factors and morbidity and 
mortality. The common characteristic of these epidemiological models is the 
explicit role of time. The models can be roughly divided into statistical 
regression models and dynamic multistate models. The main output variables 
of regression models are mortality rates, probabilities and relative risks. 
Multistate models yield disease incidence numbers and cause-specific or total 
mortality numbers. Statistical regression models are generally used to estimate 
model parameters from specific epidemiological studies or surveys. Dynamic 
multistate models are predominantly used to generalise information from 
specific studies to other contexts, often integrating data from different sources: 
e.g. assessing the health benefits of options for public health interventions or 
forecasting the future health burden. 



2.2 Statistical Regression Models

Statistical regression models are used to describe the relationship between
health risks (the outcome variable) and explanatory variables. Note that the
term health risk stands for any type of event here, for example mortality,
disease incidence, disease state transition or remission. The most common
example is the linear regression model, in which the outcome variable is related
to the explanatory variables through a linear function. However, in the
life sciences the assumptions underlying the linear regression model are rarely
satisfied. For example, outcome variables such as mortality risks and numbers
are not normally distributed. Other types of regression models have therefore
been developed that better fit the situation. These models are known as
generalised linear models (McCullagh and Nelder, 1989). We shall describe
some examples of these models here: the Cox proportional hazards model, the
Poisson regression model, the logistic regression model, and the accelerated
failure time model.

The proportional hazards regression model has become very popular since
Cox published his article on regression models and life tables (1972):

$$h(t; x) = \lambda(t) \exp(\beta' x)$$

where h is the hazard function, t represents time, x is the covariate vector,
$\lambda$ is the baseline hazard function and $\beta$ is the regression
parameter vector. In the Cox
model, the mortality hazard rate is assumed to be the product of a
non-parametric baseline hazard function and a hazard ratio. The baseline hazard
function describes the change in the hazard function over time; the hazard ratio
describes the dependency on the covariates through a regression model. The
main model assumptions are that the hazard rates for different risk factor
levels are proportional, i.e. their ratio is constant over time, and that the
effects of the covariates on the hazard rate are multiplicative.
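
The proportionality assumption can be made concrete with a short sketch in which the hazard is the product of a baseline hazard and the exponential of a linear combination of covariates. The baseline hazard, the coefficients and the covariates below are hypothetical values chosen for illustration only, and the function names are assumptions.

```python
# Sketch of the proportional hazards structure: hazard = baseline(t) * exp(beta'x).
# All numerical values are hypothetical.
import math


def hazard(t, x, beta, baseline):
    """Hazard at time t for covariate vector x."""
    return baseline(t) * math.exp(sum(b * xi for b, xi in zip(beta, x)))


def hazard_ratio(x1, x0, beta):
    """Hazard ratio of covariate profile x1 relative to x0 (free of t)."""
    return math.exp(sum(b * (a - c) for b, a, c in zip(beta, x1, x0)))


# Hypothetical baseline hazard rising exponentially with age t.
baseline = lambda t: 1e-4 * math.exp(0.09 * t)
beta = [0.7, 0.02]            # effects of smoking (0/1) and of a continuous factor
smoker = [1, 10.0]
non_smoker = [0, 10.0]

for age in (40, 60, 80):
    ratio = hazard(age, smoker, beta, baseline) / hazard(age, non_smoker, beta, baseline)
    print(age, round(ratio, 4))   # the same ratio at every age

print(round(hazard_ratio(smoker, non_smoker, beta), 4))
```

The baseline hazard cancels out of the ratio, which is why the hazard ratio can be estimated without specifying the baseline hazard.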

The Poisson regression model describes mortality numbers, given the 
population numbers initially at risk specified by risk factor level. The mortality 
numbers are assumed to be Poisson distributed, with the expected value being 
dependent on the risk factor levels through a log-linear regression model. The
exponentiated regression parameters can be interpreted as relative risks, namely
the ratio of the probability of dying for given risk factor values to that at the
baseline level.
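
A minimal sketch of the log-linear structure of the Poisson model follows. The intercept and coefficient are hypothetical, as are the function names; the example is chosen so that the baseline risk is 1 per cent and exposure doubles it.

```python
# Sketch of a Poisson (log-linear) regression structure with made-up parameters:
#   expected deaths = population at risk * exp(intercept + coefficients'x)
import math


def expected_deaths(population, x, intercept, coefficients):
    """Expected number of deaths in a group with risk factor levels x."""
    linear = intercept + sum(c * xi for c, xi in zip(coefficients, x))
    return population * math.exp(linear)


def relative_risk(coefficients, x, x_reference):
    """Relative risk of profile x compared with the reference profile."""
    return math.exp(sum(c * (a - b) for c, a, b in zip(coefficients, x, x_reference)))


intercept = math.log(0.01)        # hypothetical baseline risk of 1 per cent
coefficients = [math.log(2.0)]    # hypothetical doubling of risk when exposed

print(expected_deaths(10000, [0], intercept, coefficients))   # unexposed group
print(expected_deaths(10000, [1], intercept, coefficients))   # exposed group
print(relative_risk(coefficients, [1], [0]))
```

The relative risk depends only on the coefficients, whereas the expected numbers of deaths also require the intercept, that is, the baseline risk level.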

The logistic regression model describes the proportion of the population dying, 
depending on the initial population characteristics. For each person, the 
outcome ‘deceased’ is assumed to be binomially distributed. The probability of 
dying is ‘regressed on’ the risk factor levels through a logistic link function. 
The regression parameters can be used to calculate mortality odds ratios. 
These are the ratios of the odds of the mortality risks for given risk factor 
levels compared with the odds for the baseline level. 
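
The relation between logistic regression parameters and odds ratios can be sketched in the same way. The parameter values are again hypothetical and the function names are assumptions.

```python
# Sketch of the logistic model: the log-odds of dying are linear in the
# risk factor levels. Parameter values are made up for illustration.
import math


def death_probability(x, intercept, coefficients):
    """Probability of dying for risk factor levels x."""
    linear = intercept + sum(c * xi for c, xi in zip(coefficients, x))
    return 1.0 / (1.0 + math.exp(-linear))


def odds(probability):
    """Odds corresponding to a probability."""
    return probability / (1.0 - probability)


def odds_ratio(coefficients, x, x_reference):
    """Mortality odds ratio of profile x relative to the reference profile."""
    return math.exp(sum(c * (a - b) for c, a, b in zip(coefficients, x, x_reference)))


intercept, coefficients = -3.0, [0.5]
p_exposed = death_probability([1], intercept, coefficients)
p_baseline = death_probability([0], intercept, coefficients)

print(odds(p_exposed) / odds(p_baseline))     # ratio computed from probabilities
print(odds_ratio(coefficients, [1], [0]))     # the same ratio from the coefficient
```

Both calculations give the same value, the exponential of the coefficient, which does not depend on the intercept.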

Another example of a regression model used in epidemiology is the 
accelerated failure time model. In this model the stochastic time to mortality, 
or any other health event, is described as the product of a baseline time
variable and a time multiplicator term. The baseline time variable is generally
assumed to be Weibull-distributed. The time multiplicator term depends on the
risk factor levels through a log-linear link function. Increasing risk factor
levels result in higher values of the multiplicator term, and so in shorter times
until mortality.

Several problems have to be dealt with when fitting multivariate generalised 
linear models. Some of them are typical of all types of multivariate models,
others are typical of applications in epidemiology. Examples are the selection
of explanatory variables in the models, the handling of missing values,
different observation time periods of individuals, the effect of confounding
variables, and the generalisability of the model results. One common
characteristic of statistical analyses in epidemiology is that the main interest
lies in the relative effects of the explanatory variables on mortality, for
example whether and, if so, how changing risk factor levels result in changing
mortality or morbidity risks. Baseline risk levels or intercept values are
therefore often
not presented in scientific articles. However, without these parameters it is 
impossible to calculate absolute mortality risk values. It is generally assumed, 
albeit often without further validation, that the regression parameter values, 
just like relative risks, can be generalised to other populations. 

Example 2.1: Projection of Lung Cancer Mortality Rates for the Elderly 

Alderson and Ashwood (1985) have developed a regression model to calculate 
future mortality from lung cancer among the population aged 60-84 in England 
and Wales. Their work was based on the models presented by Doll and Peto 
(1978) and Townsend (1978). The lung cancer mortality rate was found to be 
linearly dependent on daily cigarette consumption, and also monotonically 
dependent on the duration of smoking, although non-linearly. The final model 
was the following: 

$$
\text{lung cancer death rate} = \text{constant} \times \left(\text{duration of smoking, adjusted for proportion of smokers}\right)^{k} \times \left(\text{daily cigarette consumption per adult}\right) + \text{non-smokers comp.}
$$


Ecological data have been used to fit the model: lung cancer and smoking data 
have been used from different studies. Data on smoking behaviour were 
available from the Tobacco Research Council and the General Household 
Survey. Data on consumption by age and sex between 1946 and 1975, and 
total sales data going back to 1870 were thought to be reliable. Data on the 
proportion of smokers by age and sex before 1948 were lacking and these had 
to be estimated from data on the age at which smokers and ex-smokers started 
smoking. These estimates appeared to be reasonably reliable for male smokers 
but were less reliable for women. The model fitted the data on national lung 
cancer death rates and estimated cigarette consumption reasonably well, with 
parameter k equalling four. For women, 86 per cent of the variation was 
explained, for men 77 per cent. The fit was not as good for those aged 75 and 
over, when the model consistently predicted higher death rates than those 
actually recorded. This was explained by underregistration of lung cancer as 
the primary cause of death for the very old. 
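
The structure of the model can be sketched in Python as follows. The sketch assumes that k is the exponent of the (adjusted) duration-of-smoking term; the constant, the duration, the consumption and the non-smokers component used in the example are hypothetical values, not those estimated by Alderson and Ashwood.

```python
def lung_cancer_death_rate(constant, duration_adjusted, cigarettes_per_adult,
                           non_smoker_component, k=4.0):
    """Death rate = constant * duration^k * consumption + non-smokers component.

    duration_adjusted: duration of smoking adjusted for the proportion of smokers
    cigarettes_per_adult: daily cigarette consumption per adult
    k: exponent of the duration term (assumed placement; the text reports k = 4)
    """
    return (constant * duration_adjusted ** k * cigarettes_per_adult
            + non_smoker_component)

# Hypothetical inputs, for illustration only (rates per million).
example = lung_cancer_death_rate(constant=1e-6, duration_adjusted=30.0,
                                 cigarettes_per_adult=10.0,
                                 non_smoker_component=100.0)
print(example)
```

With k = 4, doubling the duration term multiplies the smoking-related part of the rate by sixteen, whereas doubling daily consumption only doubles it; this is the sense in which the dependence on duration is monotonic but non-linear.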

The model has been used to estimate future lung cancer mortality rates 
assuming two different future smoking patterns: a pessimistic scenario, 
assuming a continuation of current smoking patterns until 2025, and an 
optimistic scenario, assuming that the proportion of smokers in each age group 
will decline at a rate of ten per cent every five years. For men, the projected 
mortality in each age group was substantially reduced in both scenarios. For 
women, only in the age group 60-69 did the optimistic scenario result in 
decreasing lung cancer mortality rates, while in all other age groups an 
increase was predicted. These results reflect the different patterns of smoking 
among males and females in the past (see Table 2.1). 
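
The two smoking scenarios can be written down as a small Python sketch. It assumes that the decline of ten per cent every five years is relative to the previous level (the text does not state whether the decline is relative or in percentage points); the starting proportion and the years are hypothetical.

```python
def project_smoker_proportion(p_start, start_year, end_year,
                              decline=0.10, step=5):
    """Return {year: proportion of smokers} for one age group.

    decline: relative decline applied every `step` years (assumption);
             decline=0.0 reproduces a 'continuation of current patterns'.
    """
    proportions = {}
    p = p_start
    for year in range(start_year, end_year + 1, step):
        proportions[year] = p
        p *= (1.0 - decline)
    return proportions

# Hypothetical starting proportion of 40 per cent smokers in 1985.
optimistic = project_smoker_proportion(0.40, 1985, 2025)
pessimistic = project_smoker_proportion(0.40, 1985, 2025, decline=0.0)
print(optimistic[2025], pessimistic[2025])
```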

Example 2.2: Mortality Projections in the Context of Assessing Burden of 
Disease 

In 1992, the World Bank initiated the Global Burden of Disease Study (GBD), 
coordinated by Christopher Murray and Alan Lopez. Its primary goal was to 
describe the health situation in the world, i.e. providing information on fatal 
and non-fatal health outcomes, and applying a measure to combine the burden 
of disease through morbidity and mortality. For mortality estimates, data from 
vital statistics and sample registration systems were combined with the results 
of population monitoring laboratories and disease-specific epidemiological 
studies. The basic form of the model used is presented in Figure 2.2. 

Table 2.1. Actual and projected mortality from lung cancer by age and sex, England and 
Wales, 1951-2025 (rates per million) 

                          60-64    65-69    70-74    75-79    80-84
Males
  1951-55                 2,577    2,947    2,645    2,085      922
  1981-84                 2,998    4,562    6,358    7,675    8,293
  2021-25 optimistic        820    1,310    2,010    2,980    4,420
  2021-25 pessimistic     1,480    2,050    2,950    4,030    5,430
Females
  1951-55                   288      356      394      436      401
  1981-84                   944    1,248    1,379    1,143    1,323
  2021-25 optimistic        650    1,020    1,480    2,150    3,180
  2021-25 pessimistic     1,040    1,550    2,100    2,820    3,710

Source: Alderson and Ashwood, 1985. 



Figure 2.2. Global burden of disease: model used in modelling mortality and morbidity 

Diagram labels: Distal socio-economic causes; Proximal biological causes; Outcomes; Time; Disability. 

Source: Murray and Lopez, 1997b. 



The model variables selected were three distal socioeconomic determinants of 
mortality, and one powerful risk factor. Socioeconomic (distal) determinants 
of disease were represented by income, education and time, the latter as an 
indicator of the development of medical technology. The only lifestyle factor, 
for which valid data were found to be available world-wide, was smoking 
(Murray and Lopez, 1997c). 

A regression model was used to describe the relationship between the mortality 
risks and the explanatory variables: 

$$
\ln M_{a,k,i} = \beta_1 \ln Y + \beta_2 (\ln Y)^2 + \beta_3 \ln E + \beta_4 \ln SI + \beta_5 T + C_{a,k,i}
$$

with: M the mortality rate in age group a, gender k, and cause of death i. Y, 
E, SI and T denote GDP per capita, education, smoking intensity, and time, 
respectively. Causes of death were grouped into 9 clusters for 14 gender and 
age groups using mortality data from 47 countries over the years 1950-1990. 
Population morbidity numbers were assumed to be proportional to mortality 
numbers, with proportionality coefficients taken constant until 2025. Several 
scenarios with respect to income, years of education and smoking behaviour 
were developed to predict future mortality and morbidity. 
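
As a minimal sketch of how such a log-linear regression yields a projected mortality rate, the following Python fragment evaluates the right-hand side for given values of income, education, smoking intensity and time. The coefficient numbering and all numerical values are assumptions made for illustration; they are not the estimates of Murray and Lopez.

```python
import math

def log_mortality_rate(c, betas, gdp_per_capita, education,
                       smoking_intensity, time):
    """ln M = b1 ln Y + b2 (ln Y)^2 + b3 ln E + b4 ln SI + b5 T + C."""
    b1, b2, b3, b4, b5 = betas
    ln_y = math.log(gdp_per_capita)
    return (b1 * ln_y + b2 * ln_y ** 2 + b3 * math.log(education)
            + b4 * math.log(smoking_intensity) + b5 * time + c)

def mortality_rate(c, betas, gdp_per_capita, education,
                   smoking_intensity, time):
    """Mortality rate for one age group, gender and cause of death."""
    return math.exp(log_mortality_rate(c, betas, gdp_per_capita, education,
                                       smoking_intensity, time))

# Hypothetical coefficients and covariate values (illustration only).
betas = (-0.3, 0.01, -0.2, 0.1, -0.01)
print(mortality_rate(-2.0, betas, gdp_per_capita=10000.0, education=8.0,
                     smoking_intensity=1.5, time=10.0))
```

In this form, a scenario is simply a set of assumed future paths for Y, E and SI, with T advancing by one unit per year.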

To make morbidity and mortality compatible for different causes and between 
different countries, the concept of Disability Adjusted Life Years (DALY) has 
been used. DALY-coefficients have been defined that measure the impact of 
morbidity and mortality in terms of disability. They vary between 0 (good 
health and so no loss of daily abilities) and 1 (death). 

Some of the main findings of the GBD project were: 

• the burden of mental illness, such as depression, alcohol dependence and 
schizophrenia has been underestimated; the same applies to injury, violence 
and suicide (Murray and Lopez, 1997b); 

• health in former socialist economies is much worse than expected, 
especially among men (Murray and Lopez, 1997a); 

• the epidemiological transition is already well-advanced in 
most parts of the world, suggesting that public health policy traditionally 
aimed at infectious disease might be obsolete (Murray and Lopez, 1997d); 

• the increase in life expectancy also entails an increase in healthy life 
expectancy (Murray and Lopez, 1997a). 

The projections for the next century show that life expectancy will continue to 
increase (79 and 88 years for men and women, respectively, born in 2020 in 
the Western world). The importance of infectious diseases is gradually 
decreasing in favour of chronic diseases, partly due to changes in age 
distribution. Old problems, such as poor mental health, smoking-related 
diseases and traffic injuries are set to get worse. 

2.3 | Dynamic Multistate Models 

Other model types used in epidemiology to describe morbidity and mortality 
are so-called dynamic multistate models. These models can be interpreted as 
being extensions to the life table method. Dynamic multistate models describe 
more event types than just total mortality, for example disease incidence or 
change in risk factor levels. For any transition from one state to another, the 
so-called state transition rates are introduced. For example, the mortality rate 
is the transition rate from the state ‘being alive’ to the state ‘deceased’. 
Dynamic multistate models have been introduced for several reasons. They 
mimic the nature of demographic and epidemiological processes better than 
traditional regression models. They provide a framework to combine data 
from different sources to describe different aspects of morbidity and mortality 
simultaneously over time and age. 

One general assumption of most dynamic multistate models is the so-called 
Markov property, meaning that the model states distinguished contain all 
information from the past, and that given the current model state, future model 
behaviour is independent of the past. 
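
A minimal Python sketch of a Markov multistate model is given below: a cohort is distributed over three states and moved forward one year at a time with a fixed matrix of transition probabilities. The states and the probabilities are hypothetical.

```python
# States: 0 = healthy, 1 = diseased, 2 = deceased (hypothetical example).
TRANSITION = [
    [0.95, 0.04, 0.01],   # from healthy
    [0.00, 0.90, 0.10],   # from diseased
    [0.00, 0.00, 1.00],   # deceased is an absorbing state
]

def step(distribution, transition):
    """Move the state distribution forward by one time step."""
    n = len(distribution)
    return [sum(distribution[i] * transition[i][j] for i in range(n))
            for j in range(n)]

def project(distribution, transition, years):
    """Return the state distribution for year 0, 1, ..., years."""
    path = [list(distribution)]
    for _ in range(years):
        path.append(step(path[-1], transition))
    return path

path = project([1.0, 0.0, 0.0], TRANSITION, years=10)
print(path[-1])
```

The Markov property is visible in `step`: the next distribution depends only on the current one, not on how long individuals have already spent in a state.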

The so-called microsimulation concept provides a framework for modelling 
processes including past life histories, thus avoiding the Markov assumption. It 
is a technique based on generating large numbers of random life histories by 
Monte Carlo techniques. These life histories are generated in the form of 
stochastic continuous processes and stochastic events, for example 
risk factor levels, marrying, disease incidence and mortality. Populations are 
simulated by generating sets of life histories. These sets provide 
information not only on the mean population values of demographic and 
epidemiological variables, but also on their distributions. With steadily 
increasing computing power, the computational disadvantages of microsimulation 
are becoming less important, since large numbers of histories can be simulated within 
reasonable time limits (Van Oortmarssen, 1995a). Because a separate life history 
is generated for each individual, the Markov assumption is easily avoided. 
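
The idea can be illustrated with a small Python sketch in which each life history is simulated separately and the risk of dying depends on the time already spent with a disease, i.e. on the individual's past. All probabilities are hypothetical and far simpler than those of any model discussed here.

```python
import random

def simulate_life(rng, max_age=110, onset_prob=0.02,
                  base_death_prob=0.01, extra_per_year_ill=0.01):
    """Simulate one life history; return (age at onset or None, age at death)."""
    onset_age = None
    for age in range(max_age):
        death_prob = base_death_prob
        if onset_age is not None:
            # Risk grows with years lived with the disease: this depends on
            # the individual's history, so the process is not Markov in the
            # states 'healthy', 'diseased' and 'deceased' alone.
            death_prob += extra_per_year_ill * (age - onset_age)
        if rng.random() < min(death_prob, 1.0):
            return onset_age, age
        if onset_age is None and rng.random() < onset_prob:
            onset_age = age
    return onset_age, max_age

def simulate_population(n, seed=0):
    """Generate n independent life histories with a reproducible seed."""
    rng = random.Random(seed)
    return [simulate_life(rng) for _ in range(n)]

histories = simulate_population(1000, seed=1)
ages_at_death = sorted(death for _, death in histories)
print(sum(ages_at_death) / len(ages_at_death), ages_at_death[len(ages_at_death) // 2])
```

Because the output is a set of individual histories, the whole distribution of ages at death is available, not only its mean.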
2.3.1. Models of infectious diseases 

Infectious disease models describe the process of the outbreak and spread of 
infectious diseases in closed populations. They are often referred to as SIR 
models because of the three model states distinguished, namely the states 
‘susceptible’, ‘infected/infectious’ and ‘remission’. A susceptible person is 
disease-free, and can become infected. An infected person carries the disease 
and can convey it to other susceptibles. Following the period of infection, the 
person recovers from the disease and becomes immune. After loss of 
immunity, the process starts again. Because the number of infected 
persons initially tends to be small, and the spread may depend on individual 
risk behaviour, many infectious disease models are microsimulation models. 
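
The cycle through the three states, including loss of immunity, can be sketched as a deterministic compartmental model in Python. This is a simplification: as noted above, many infectious disease models are stochastic microsimulation models instead. The parameter names `beta` (transmission rate), `gamma` (recovery rate) and `omega` (rate of loss of immunity) and their values are assumptions for illustration.

```python
def sirs_step(s, i, r, beta, gamma, omega, dt=1.0):
    """Advance susceptible, infected and immune numbers by one time step."""
    n = s + i + r
    new_infections = beta * s * i / n * dt   # contacts between S and I
    new_recoveries = gamma * i * dt          # infected persons become immune
    lost_immunity = omega * r * dt           # immune persons become susceptible
    return (s - new_infections + lost_immunity,
            i + new_infections - new_recoveries,
            r + new_recoveries - lost_immunity)

def simulate_sirs(s0, i0, r0, beta, gamma, omega, steps, dt=1.0):
    """Return the list of (S, I, R) values for step 0, 1, ..., steps."""
    path = [(s0, i0, r0)]
    for _ in range(steps):
        path.append(sirs_step(*path[-1], beta, gamma, omega, dt))
    return path

# Closed population of 1,000 persons, 10 of whom are initially infected.
path = simulate_sirs(990.0, 10.0, 0.0, beta=0.3, gamma=0.1, omega=0.01, steps=100)
print(path[-1])
```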

We shall describe two representative examples of infectious disease models: a 
model of the spread of HIV/AIDS that applies so-called back-calculation 
methods, and a microsimulation model of the spread of gonorrhoea and 
chlamydia infections. 

Example 2.3: Back-Calculation of HIV/AIDS 

Downs et al. (1997) developed a model to reconstruct the HIV epidemic and 
to provide forecasts of AIDS incidence among adults in several European 
countries. The model is based on the concept of back-calculation. Data on the 
prevalence of infection with HIV are generally unavailable. However, new 
cases of AIDS are registered fairly accurately in most European countries, 
although often with some reporting delay. Given data on AIDS incidence and 
data on the incubation period of the virus, HIV prevalence can be estimated 
backwards. Assuming future trends on HIV prevalence, new AIDS cases can 
be predicted. Because the model results are highly sensitive to the assumptions 
on the incubation time period, and since HIV prevalence is unknown, reliable 
predictions can be given only up to five years ahead. 
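
The principle of back-calculation can be shown with a toy Python sketch. It assumes a known incubation distribution and noise-free AIDS counts, and recovers past HIV infections by simple back-substitution; actual applications such as that of Downs et al. rely on statistical estimation and a staged incubation model. The incubation probabilities and infection numbers below are hypothetical.

```python
def expected_aids_cases(hiv_infections, incubation_pmf):
    """Forward step: AIDS cases per year implied by past HIV infections.

    incubation_pmf[d] is the probability that AIDS is diagnosed d years
    after infection; index 0 is unused and set to 0.
    """
    horizon = len(hiv_infections) + 1
    cases = [0.0] * horizon
    for year, infections in enumerate(hiv_infections):
        for delay in range(1, len(incubation_pmf)):
            if year + delay < horizon:
                cases[year + delay] += infections * incubation_pmf[delay]
    return cases

def back_calculate(aids_cases, incubation_pmf):
    """Backward step: recover infections in years 0..T-1 from cases in 1..T."""
    infections = []
    for t in range(1, len(aids_cases)):
        # Cases in year t already explained by infections before year t-1.
        explained = sum(infections[s] * incubation_pmf[t - s]
                        for s in range(t - 1)
                        if t - s < len(incubation_pmf))
        infections.append((aids_cases[t] - explained) / incubation_pmf[1])
    return infections

pmf = [0.0, 0.05, 0.10, 0.15, 0.20]      # hypothetical incubation distribution
true_infections = [100.0, 200.0, 400.0, 300.0]
cases = expected_aids_cases(true_infections, pmf)
print(back_calculate(cases, pmf))
```

The division by a small probability in `back_calculate` shows why the results are so sensitive to the assumed incubation period: small errors in the AIDS counts or in the distribution are amplified, particularly for the most recent years.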

The incubation period is described by a Markov-model containing seven 
stages of disease progression following Longini et al. (1993). Stages one to 
six correspond to pre-AIDS stages and stage seven to diagnosed AIDS. The 
model assumes that an infected person passes successively and irreversibly 
through all stages. The mean incubation period from infection to AIDS is 
10.3 years in the absence of treatment, and 13.0 years for persons treated 
from the start of stage four. The assumed transition rates from AIDS to death 
correspond to mean survival times from AIDS of 2.0 years without treatment 
and 2.6 years with treatment. Estimates of the treatment effects were derived 
from a French hospital-based cohort study and applied to all countries. 

Predictions of AIDS cases up to the year 1998, with approximate 95 per cent 
tolerance limits, were made under the hypothesis that annual HIV incidence 
from 1994 onwards would remain constant at the level estimated for 1993. 
Between 1994 and 1998, annual AIDS incidence was predicted to 
increase by 24 per cent in the EU and by 48 per cent in the low-prevalence 
countries (see Figure 2.3). The expected increases differ by transmission 
group and by country. 

Example 2.4: A Microsimulation Model of Gonorrhoea and Chlamydia 

A simulation model has been developed that describes the spread of two 
sexually transmitted diseases, i.e. gonorrhoea and chlamydia (Kretzschmar et 
al., 1996). The model has been used to compare the effects of a number of 
prevention and intervention measures. It is a Markov-type model with one-day 
time steps, describing the formation and dissolution of relationships and the 
transmission of the disease as stochastic processes. This means that people can 
be in different model states and can move to another state during a one-day 
time step. Model states have been defined according to the status of the partner 
relationships, sexual activity, and the state of the infectious disease. 

The model population for the analyses described here consisted of 10,000 
heterosexual individuals aged 15-64 years. Long partner relationships (with 
mean value 6.9 years) and short relationships (with mean value ten days) were 
distinguished, with different sexual activity frequencies (0.25 per day and one 
per day, respectively). A distinction was also made between younger persons 
(up to age 35) and older persons. Data to estimate the model parameters were 
taken from a national survey on sexual behaviour and, for the sexually active 
‘core’ group, from a regional registration. European data were used on the 
mixing of persons of different age classes. 

Four prevention strategies were compared to one reference strategy (see 
Figure 2.4). The alternative strategies entailed tracing the sexual contacts of 
infected persons, screening of subgroups, and use of condoms. The effects of 
the strategies are defined as the relative decrease in the prevalence of 
gonorrhoea and chlamydia as a result of the strategies applied. It appeared 
that tracing and treating partners of infected persons was very effective in 
reducing prevalence. 

Figure 2.3. Estimated total annual AIDS incidence in the European Union 1981-1998 
among adults and adolescents (pre-1993 case definition), with approximate Bayesian 
prediction intervals (1994-98) 

Source: Downs et al., 1997. 

Figure 2.4. Effectivity ratios of interventions on chlamydia 

Legend: female, symptomatic; female, asymptomatic; male, symptomatic; male, asymptomatic. 

Notes: scenario 1: treatment of symptomatic infected persons, not partners; scenario 2: 
scenario 1 + 25 per cent of partners; scenario 3: scenario 1 + annual screening of 20 
per cent of females aged 15-24; scenario 4: scenario 1 + screening of 50 per cent of 
‘core’ group twice a year; scenario 5: scenario 1 + 15 per cent of ‘core’ group and 6 
per cent of ‘non-core’ group use condoms. For each scenario, the prevalence ratio is 
defined as prevalence relative to prevalence in scenario 1. 

Source: Kretzschmar et al., 1996. 



2.3.2. Models of chronic diseases 

Dynamic multistate models of morbidity and mortality from chronic diseases 
differ in some respects from models on infectious diseases. The model time 
span is generally much longer, since the time period between risk factor 
exposure on the one hand and morbidity and mortality from chronic diseases 
on the other may be considerable. The onset of chronic disease depends 
mainly on individual risk factor levels, not on interactions between individuals. 

However, heredity is becoming more important as an explanatory variable for 
chronic diseases and so introduces the concept of interactions between 
individuals in chronic disease models. 

There are many different types of chronic disease models. We present four 
examples here. (1) The PREVENT model describes cause-specific mortality 
numbers over time for given risk factor distributions. Disease morbidity has 
recently been included in the model. It has been used to estimate the effect of 
risk factor prevention programmes. (2) The Manton and Stallard model 
describes the stochastic change in risk factor levels over time and the 
interrelationship with mortality. It has been used to estimate the effect of risk 
factor interventions on life expectancy. (3) The MISCAN model describes the 
process of the onset of cancer, and has been used in decision-making on breast 
cancer screening. (4) The POHEM model is an example of a microsimulation 
model used to calculate morbidity and mortality from chronic diseases. 

Example 2.5: The PREVENT Model 

The PREVENT model has been developed as a tool for policymakers to 
estimate the public health effects of changes in risk factor prevalences, either 
autonomous or through interventions (Gunning-Schepers, 1988). The main 
data used are risk factor class-specific prevalence rates, cause-specific 
population mortality rates, relative mortality risks for all combinations of risk 
factor classes and causes of death, and initial total population numbers. For 
each year, cause-specific mortality numbers are calculated given the current 
population numbers in all risk factor classes distinguished. Several forms of 
time delay between risk factor changes and changes in mortality risks have 
been included in the model. 

The model simulates the development over time of two populations: one 
baseline population resulting in mortality numbers following current trends in 
risk factors and demography, and the other resulting in mortality numbers 
due to specific risk factor interventions. The differences between the two 
populations are the result of risk factor intervention. PREVENT uses two 
epidemiological measures, the relative risk (RR), as model input, and the 
potential impact fraction (PIF) as model output. The RR is defined as the 
ratio of the incidence rate among the exposed over the rate among the never- 
exposed. The PIF is the proportion of mortality prevented by risk factor 
intervention. 
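
A minimal Python sketch of the potential impact fraction is given below. It uses the commonly applied formula based on risk factor prevalences before and after intervention and the relative risks per risk factor class; it ignores the time delays and demographic dynamics that PREVENT itself includes, and the prevalences and relative risks are hypothetical.

```python
def potential_impact_fraction(prevalence_before, prevalence_after, relative_risks):
    """PIF = (sum p_i RR_i - sum p'_i RR_i) / sum p_i RR_i.

    prevalence_before / prevalence_after: proportions of the population in
    each risk factor class (each list sums to 1); relative_risks: RR per class,
    with RR = 1 for the never-exposed class.
    """
    before = sum(p * rr for p, rr in zip(prevalence_before, relative_risks))
    after = sum(p * rr for p, rr in zip(prevalence_after, relative_risks))
    return (before - after) / before

# Hypothetical example: smokers decline from 30 to 20 per cent, RR = 10.
pif = potential_impact_fraction([0.7, 0.3], [0.8, 0.2], [1.0, 10.0])
print(round(pif, 3))
```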

The commercial version of PREVENT contains the following causes of 
death: ischemic heart disease, cerebrovascular accident, chronic obstructive 
lung disease, lung cancer, breast cancer, and traffic and other accidents, and 
risk factors such as cigarette smoking, hypertension, cholesterol level, diet 
and alcohol. The new scientific version of PREVENT Plus also describes 
disease prevalence by ‘subtracting’ given disease periods from the time of 
death (see also Example 2.3 on the back-calculation of HIV/AIDS). 

Example 2.6: The Stochastic Risk Factor Change Model 

Deterministic models result in fixed future mortality numbers conditional on 
all model parameter values. The outcomes of microsimulation models are 
samples of mortality numbers; the distribution of the mortality numbers has to 
be estimated by drawing many samples. As a supplement to these two types of 
models, the stochastic risk factor change model describes the change over time 
of the distribution of stochastic risk factor levels and mortality numbers. 
Detailed information on the model can be found in Manton et al. (1993, 
1991), Manton (1992, 1991), and Manton and Stallard (1992). 

The stochastic risk factor change model describes simultaneously two 
interrelated processes, the change in continuous risk factor variables over time 
and survival, for all members of a cohort separately. The risk factor 
distribution function of the cohort changes as a result of individual changes, 
firstly because the individual risk factor levels change, and secondly because 
individuals with high risk levels have higher mortality risks than those with 
low levels. The main model assumptions are the following: all risk factors are 
initially multivariately normally distributed, the change in risk factor levels is a 
stochastic process of linear deterministic changes (drift) and random changes 
(diffusion), and finally the mortality rate depends on the risk factor levels 
through a quadratic regression function. To take into account the uniform 
change in the mortality rate over time, all risk factor values are transformed 
using a baseline mortality rate, usually of the Gompertz-type. Mortality can be 
specified by cause of death. 
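
The three assumptions can be illustrated with a simplified Python sketch for a single risk factor, simulated here per individual rather than through the evolution of the distribution itself. The initial level is normally distributed, it changes by a drift plus a random diffusion term each year, and the hazard is a quadratic function of the risk factor multiplied by a Gompertz term in age. The multiplicative form of the hazard, the function names and all parameter values are assumptions made for illustration and are not taken from Manton and colleagues.

```python
import math
import random

def hazard(x, age, q0=0.0005, q2=0.00002, optimum=120.0, theta=0.08):
    """Quadratic in the risk factor level x, scaled by a Gompertz term in age."""
    return (q0 + q2 * (x - optimum) ** 2) * math.exp(theta * (age - 30))

def simulate_cohort(n, start_age=30, end_age=110, mean0=125.0, sd0=15.0,
                    drift=0.5, diffusion=3.0, seed=0):
    """Return the simulated ages at death of n cohort members."""
    rng = random.Random(seed)
    ages_at_death = []
    for _ in range(n):
        x = rng.gauss(mean0, sd0)          # normally distributed initial level
        age = start_age
        while age < end_age:
            if rng.random() < 1.0 - math.exp(-hazard(x, age)):
                break                      # death during this year of age
            x += drift + diffusion * rng.gauss(0.0, 1.0)   # drift + diffusion
            age += 1
        ages_at_death.append(age)
    return ages_at_death

def mean(values):
    return sum(values) / len(values)

baseline = simulate_cohort(2000, seed=1)
no_drift = simulate_cohort(2000, drift=0.0, seed=1)
print(mean(baseline), mean(no_drift))
```

Setting the drift to zero mimics, in a very crude way, an intervention that keeps risk factor levels from rising with age.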

The model has been estimated using different datasets. By changing the 
parameters that govern the deterministic change in the risk factors (drift) 
and/or the parameter that governs the baseline mortality rate, the authors 
estimated the long-term effects of changing risk factors. A few results of these 
analyses are shown in Table 2.2. 

Life expectancies in the first block of Table 2.2 were obtained using data from 
the Framingham Heart Study. The results of risk factor interventions show the 
effects of holding the risk factor levels to the ‘optimum’ values, i.e. close to 
the values at age 30. In this way, life expectancies of 99.9 (!) years for males 
and 97 years for females were predicted for those who survived to age 30. 
These ‘optimum’ life expectancy figures are still 21 to 24 years lower than the 
highest achieved age confirmed for a human. It is worth noting that some 
religious populations with healthy lifestyles have life expectancies as high as 
the ‘optimum’ ones, and so the ‘optimum’ life expectancy is still not the 
highest possible level. 

Example 2.7: The MISCAN Model 

MISCAN is short for Microsimulation Screening Analysis. The model was 
developed in the Netherlands in the early 1980s (Van Oortmarssen et al., 
1982). The MISCAN simulation model can be used to analyse data from 
screening projects and to make predictions of costs and effects of alternative 
screening policies. The disease process is described in the form of a dynamic 
multistate model. The main characteristics of the screening programmes and 
screening tests are given in conjunction with the effect of early detection on 
prognosis. The basic model can be elaborated from a very simple to a rather 
complex structure (Van Oortmarssen, 1995a). Figure 2.5 presents the basic 
structure of the model. Costs are calculated in post-processing routines. 

Table 2.2. Estimates of life expectancy with risk factor interventions 

                                        e30      e0*      e65     e0**
1. Framingham study
   Baseline
     Males                             44.3     74.3     15.0     80.0
     Females                           49.9     79.9     18.8     83.3
   With risk-factor interventions
     Males                             69.9     99.0     36.3    101.3
     Females                           69.0     97.0     39.0    104.0
2. Framingham study (no sex)
   Baseline                            44.5     74.5
   With risk-factor interventions      57.3     87.3
3. Kaunas study (no sex)
   Baseline                            44.8     74.8
   With risk-factor interventions      50.1     80.1
4. Finnish east-west (no sex)
   Baseline                            39.3     69.3
   With risk-factor interventions      55.9     85.9

Note: e0* for survivors to 30 years, e0** for survivors to 65 years. 

Sources: 1. Manton, Stallard, and Tolley, 1991: Estimation based on the Framingham Heart 
Study, U.S.; data assessed every two years between 1950 and 1984; 11 risk factors 
included (sex, diastolic and systolic blood pressure, smoking, vital capacity index, 
blood glucose, haematocrit value, body mass index, serum cholesterol, pulse rate, 
cardiac enlargement). 2. Manton, 1992: Follow-up time: 20 years, eight risk factors 
(pulse pressure, diastolic blood pressure, body mass index, cholesterol, blood sugar, 
haemoglobin, vital capacity index, smoking). 3. Manton, 1992: Estimation based on 
the Kaunas, Lithuania Study; follow-up time: 11 years; six risk factors (pulse 
pressure, diastolic blood pressure, body mass index, cholesterol, glucose tolerance, 
smoking). 4. Manton, 1992: Estimation based on the Finnish East-West Study; 
follow-up time: 25 years; seven risk factors (pulse pressure, diastolic blood pressure, 
body mass index, cholesterol, vital capacity index, smoking, pulse rate). 

Figure 2.5. Basic structure of a model for cancer screening 

Diagram labels: (1) DETECTION; (2) PROGNOSIS. 

Source: Van Oortmarssen, 1995b. 



The outcomes of MISCAN were of great value to the decision-making process 
on breast cancer screening in the Netherlands. A two-year screening 
programme for breast cancer was implemented in 1998 for women aged 50- 
70. With a participation rate of 70 per cent, MISCAN calculated a reduction in 
breast cancer mortality of 17 per cent annually (800 women per year in the 
Netherlands). It is not yet possible to judge the effects of the screening 
programme, but monitoring will take place. Modelling of prostate and 
colorectal cancer screening was recently initiated. A completely revised 
version of MISCAN was constructed for this purpose (Van Oortmarssen, 
1995b). 

Example 2.8: The POHEM Model 

The POHEM (Population HEalth Model) is a microsimulation model of the 
dynamics of disease and health-related characteristics among the Canadian 
population (Wolfson, 1991, 1994). Using post-processing routines, the 
morbidity and mortality outcome variables are combined in the form of a 
‘population health index’. This index is comparable to the concept of 
disability-adjusted life years, described by Murray and Lopez (see Section 
2.2). 

The POHEM model consists of several modules that describe specific 
demographic or epidemiological processes. Since the first version of the model 
was formulated to simulate pensions, POHEM contains many more 
demographic and socioeconomic variables than most other multistate models 
used in epidemiology. This also explains another specific model feature: POHEM 
creates not just individuals, but ‘family structures’. Males and females are 
generated in pairs in anticipation of marriage or common law union. Children 
and remarriage partners are explicitly included in these structures as well. A 
case is completed with the death of the last adult. The demographic and 
socioeconomic variables and events distinguished are: educational attainment, 
first union, first and second spousal age difference, fertility, union dissolution, 
child custody, child leaving home, remarriage, labour force participation and 
labour market earnings. The risk factors included are: radon, blood pressure, 
obesity, smoking and cholesterol. The diseases included are: coronary heart 
disease, lung cancer, breast cancer, dementia and mortality from other causes. 



2.4 | Discussion 

Several observations can be made with respect to the use of epidemiological 
models in both scientific research and the process of policymaking. 

An important issue in the application of epidemiological models to support 
policymaking is the external validity of epidemiological data. To what extent 
can epidemiological insights derived from specific study populations be applied 
when making predictions with respect to other populations? In other words: 
Do risk functions with respect to risk factors such as body mass index, 
smoking behaviour, blood pressure, and serum cholesterol levels derived from 
the famous Framingham study (New England, USA) have validity for western 
European countries or, an even more problematic issue, for less-developed 
populations in the African Sahel? Most results from epidemiological studies 
are presented as relative mortality or morbidity risks. They are often 
implemented in epidemiological models irrespective of the absolute mortality 
risk rates, which may differ substantially from one population to another. It is 
clear that the validity of these types of ‘extrapolations’ can be justly 
challenged. To meet these objections, models are often used in the context of 
scenario studies in which alternative scenarios, for instance with respect to risk 
factor prevalence as a result of prevention programmes, are compared with a 
reference scenario. This type of comparative evaluation of public health 
measures may be valuable to policymakers. Having said that, transparency and 
uncertainty analysis are crucial when presenting the results. 

Nearly two centuries of successful preventive medicine and public health 
policy have stretched life expectancy in Western societies almost to the 
biological limits. As a result, the focus of public health has gradually shifted 
from life expectancy to health expectancy: postponing as long as possible or 
mitigating the physical, mental or social limitations brought about by the 
chronic diseases of older age. The outcome variables selected in recent 
epidemiological studies are not merely cause-specific mortality and morbidity, 
but increasingly also include subjective and objective measures of the quality 
of life. Likewise, many of the dynamic multistate models presented here not 
only describe mortality, but increasingly focus on morbidity and its effect on 
quality of life (disease burden). This obviously increases the level of 
complexity and normative nature of the models, again calling for transparency 
and uncertainty analysis in the presentation of the results. 

Needless to say, the development of epidemiological projection models has 
been driven by etiological insights with respect to a limited number of risk 
factors and (chronic) diseases. It is important to note that in most cases these 
risk factors only explain part of the observed morbidity and mortality. The 
multi-causal nature of chronic disease thus calls for some caution with respect 
to epidemiological morbidity and mortality projections. 

An important issue in epidemiology is the problem of causality. Several 
criteria have been developed to evaluate the causality of associations found 
between risk factors and mortality risks. Causality, however, is not an absolute 
concept. For instance, many studies have shown that health status can be 
predicted very accurately on the basis of ‘material’ circumstances, such as 
residence or car ownership, voting behaviour (conservative versus labour), or 
professional status. However, it is clear that using these indicators as input for 
epidemiological models will not yield valuable findings for policymakers. 

A related problem is the hierarchy between explanatory factors. Figure 2.1 
illustrates the obvious dependency between exogenous risk factors, such as 
smoking behaviour, dietary habits and physical activity, and endogenous 
factors such as blood pressure, body mass index and serum cholesterol level. 
This clustering of risk factors on different levels may lead to substantial 
overestimation of the benefits of risk factor intervention, when the health gain 
is calculated for each risk factor separately. 

In the last decade, obvious health differences between socioeconomic groups
have attracted much attention. Socioeconomic status (SES) is often measured 
on the basis of level of education, income and professional status. Differences 
are to some extent explained by an unequal distribution of unhealthy habits, 
such as smoking, alcohol abuse, intake of fat, lack of fruit and vegetables in 
diet. Furthermore, social support, employment rates, number of life events 
and use of care facilities (in particular specialist care) appear to be less 
favourable among individuals of the lowest social strata. This unequal 
distribution of exogenous determinants across SES-groups is reflected in the 
distribution of endogenous (or intermediate) factors, such as hypertension, 
unfavourable blood lipoprotein composition, and obesity. However, this 
unequal distribution of risk factors does not explain the full extent of 
socioeconomic health differences. As knowledge on the causal pathways 
involved is still very diffuse and inconclusive, socioeconomic health
differences are a typical example of a complex public health domain that is not 
yet ready for modelling. 

Historic validation of epidemiological models is very difficult. Morbidity and 
mortality are a result of many different processes during an individual’s 
lifetime. Moreover, the exposure distribution of the population to certain 
important risk factors, such as smoking, has changed substantially over the 
past decades. Several attempts to explain the observed changes in mortality 
risks by trends in risk factor distribution using epidemiological models have 
therefore not always been successful, primarily due to a lack of good historical 
data on the risk factors and disease prevalence rates. 

Epidemiological models provide a means of investigating the benefits of public
health measures, such as prevention programmes. However, note that 
information on the effectiveness of these programmes, in terms of changing 
risk factor prevalences, is lacking. Many model studies estimate the potential 
health benefits using optimal levels of risk factor exposure. In other scenario 
studies, the reductions in risk factor exposure as a result of prevention 
programmes are based on educated assumptions, which tend to be rather 
optimistic. In reality, prevention policies that strive to change our ‘bad habits’ 
or to change our risk factor status in another way often prove to be far less 
effective. For example, despite repeated anti-smoking campaigns, smoking 
prevalence rates appear to be increasing among young people in the 
Netherlands. There therefore appears to be little chance of realising the goals 
that have been set in, for example, the Health For All 2000 campaigns. 

An important issue when developing models is uncertainty and reliability or, in
statistical terms, confidence. Although generalised linear models are more 
complex than standard linear regression models, statistical theory has provided 
methods to calculate confidence bounds or prediction intervals. These methods 
can be easily applied when fitting these models to specific data sets. In the case 
of dynamic multistate models, the issue of model reliability is much more 
complex. Data are usually derived from many different studies; some may
even be derived from expert judgements. Therefore, model uncertainty is often
defined in terms of sensitivity of the model outcome variables to the model 
input parameters. Comparing the model results of different scenarios may also 
be regarded as an example of model sensitivity analysis. 

We may thus conclude that epidemiological models can be a useful tool for 
estimating (future) disease burden and associated demands for medical cure 
and care, as well as for assessing the effectiveness of prevention programmes. 
They should, however, always be interpreted with due care, given the many 
methodological limitations involved. 

Part 2. Theoretical Perspectives on Forecasting Mortality

3. A Regression Model of Mortality, with Application to the Netherlands

Christopher HEATHCOTE and Tim HIGGINS

Abstract 

Regression methods are used to model and estimate a measure of the mortality 
of a population as a function of time and age. The measure of mortality used is 
the logit (or log odds) of the cohort time and age-specific probability of death 
and it is shown how a parameterised model can be estimated by weighted least 
squares. The method is applied to historical 1890-1990 data of male and 
female mortality for ages 40 and above in the Netherlands. The fitted 
regressions provide the point of departure for the predictive model and 
forecasts developed in the next chapter. 



3.1 | Introduction

The process of constructing a forecast typically consists of two steps. The first
is concerned with formulating and estimating a descriptive model of relevant 
historical data, and the second consists of combining the historical information 
with views of future developments to obtain a predictive model from which 
forecasts are derived. Following this two-stage procedure, the first step is 
implemented in the present chapter with the next chapter dealing with 
modifications to the descriptive model that seem necessary to obtain plausible 
forecasts of future male and female mortality in the Netherlands. 

The discussion of mortality presented here will be along lines familiar to
readers acquainted with statistical and econometric modelling. It differs to a
greater or lesser degree from the approaches usually adopted by demographers
as, for example, by Alho (1990), Land (1986), Long (1984), Pollard (1987)
and Willekens (1992). We introduce the notion of a mortality surface, written
$\delta(t,x)$, which is a measure of the mortality of a population indexed by time $t$
and age $x$. For example, $\delta(t,x)$ could be the logarithm of the central
mortality rate at the point $(t,x)$ in the Lexis plane. Thus $\delta(t,x)$ is a function on
the Lexis plane that is assumed to be estimable from historical data. Figure
3.1, with data supposed to be available for $(t,x)$ in
$t(0) \le t \le t(1)$, $x(0) \le x \le x(1)$, illustrates the point. The part of the Lexis
plane used later in this chapter to model Dutch mortality is for the years
$1890 \le t \le 1990$ and ages $40 \le x \le 94$. An appropriate mortality surface
(defined below) is estimated by standard regression methods.



Figure 3.1. Lexis diagram, with time $t$ on the horizontal axis and age $x$ on the
vertical axis. Historical data shown as a rectangle
$t(0) \le t \le t(1)$, $x(0) \le x \le x(1)$.
The cohort born at time $c$ lies on the diagonal commencing at $(c,0)$

Many contributions concerned with forecasting in fact dwell mainly on the
construction and estimation of a model of historical mortality. What seems to
be the first use of modern statistical techniques is due to Cramer and Wold
(1935) in their study of Swedish mortality during the years 1800-1930. For
ages $x$ from 30 to 80 they used the method of minimum chi-square to fit a
Makeham curve $A + BC^x$ to the central mortality rate $m(t,x)$ for each year
$t$. The next step was to fit curves describing the variation of the three
parameters $A$, $B$, $C$ with time. They thus obtained an estimate of what can be
called the mortality surface $A(t) + B(t)C(t)^x$ for the years
$1800 \le t \le 1930$ and ages $30 \le x \le 80$. The formula of the estimated
mortality surface was then used as the basis for forecasts, by extrapolation.

Pollard (1949) reviews early work on the estimation of parametric models of 
mortality, including that of Cramer and Wold alluded to above, and also of 
Rhodes (1941) who, using the work of Kermack et al. (1934), estimated a 
different parametric model of 19th century mortality in England and Wales. 
What is evident from this pioneering work is the possibility of a variety of 
useful models, all requiring fairly complicated non-linear methods of 
estimation, and the necessity of extensive historical data. It is worth observing 
that this work was essentially data-analytic in nature, preoccupied with difficult 
technical curve-fitting problems, and did not take into account further 
contextual considerations, such as the possible independence of cohorts or the 
heterogeneity of the population, that may be relevant to the specific problem of 
modelling mortality. 

More recent authors using a data-analytic approach have inclined to time series
methods in the spirit of Box and Jenkins (1970). Models have been proposed
in terms of period mortality vectors:

$$\delta(t) = \delta(\beta;t) + \varepsilon(t).$$

The elements of each vector are the age-specific entries for year $t$ and $\beta$ is a
vector of parameters. McNown and Rogers (1989) fit a Heligman-Pollard
(1980) eight parameter formula to mortality schedules for each fixed year and
then fit univariate ARIMA models to the eight time series of estimated
parameter values. Vector autoregressive and moving average models are
applied by Hagnell (1991) to Swedish data and a variant of this approach due
to Bell and Monsell (1991) reduces the dimensionality of the problem by the
use of principal components. Lee and Carter (1992) model the central time-
and age-specific death rate $m(t,x)$ to fit what in effect is the mortality surface:

$$\log m(t,x) = a(x) + b(x)k(t),$$

with $k(t)$ a random walk with drift:

$$k(t) = c + k(t-1) + e(t) = ct + \sum_{j=1}^{t} e(j),$$

and the $e(j)$ being independent identically distributed random variables.
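
The Lee and Carter structure just described can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the drift, the shock standard deviation and the age profiles a(x) and b(x) are hypothetical values chosen only for illustration, and k(0) = 0 is assumed so that the recursive and closed forms of the random walk coincide.

```python
import random


def random_walk_with_drift(c, shocks):
    """Return [k(1), ..., k(T)] with k(0) = 0 and k(t) = c + k(t-1) + e(t)."""
    path, level = [], 0.0
    for e_t in shocks:
        level = c + level + e_t
        path.append(level)
    return path


def lee_carter_log_rates(a, b, k):
    """log m(t, x) = a(x) + b(x) k(t); rows index years t, columns index ages x."""
    return [[a_x + b_x * k_t for a_x, b_x in zip(a, b)] for k_t in k]


rng = random.Random(0)                     # fixed seed, for reproducibility only
c = -0.5                                   # hypothetical drift
shocks = [rng.gauss(0.0, 0.3) for _ in range(50)]  # hypothetical i.i.d. e(j)
k = random_walk_with_drift(c, shocks)

# Closed form c*t + e(1) + ... + e(t), which should reproduce the recursion.
closed_form, running = [], 0.0
for t, e_t in enumerate(shocks, start=1):
    running += e_t
    closed_form.append(c * t + running)

a = [-6.0, -5.0, -4.0]                     # hypothetical a(x) for three ages
b = [0.02, 0.015, 0.01]                    # hypothetical b(x) for three ages
log_m = lee_carter_log_rates(a, b, k)
```

The comparison of `k` with `closed_form` simply confirms that the two expressions for k(t) in the display agree once k(0) = 0 is imposed.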

Manton and Stallard (1992) use a multivariate autoregression to describe the 
process of ageing and mortality. Likelihood methods are applied to estimate 
parameters of the model from data obtained from the Framingham survey. An 
advantage of their approach is that the effect of risk factors can be studied in 
some detail; a disadvantage is that the data requirements are heavy and 
unlikely to be satisfied except in special circumstances. 

The approach to modelling mortality presented here is different in the sense 
that assumptions about the stochastic processes of life histories provide our 
point of departure and we develop the primitive model of Heathcote and 
McDermid (1994). Thus birth cohorts play a prominent role and processes 
along the diagonals of the Lexis plane are the basis of the modelling of a 
mortality surface. The result is the regression model described in the next 
section. It is heteroscedastic but with approximately normally distributed 
independent errors and is estimated by iteratively reweighted least squares. 
Sections 3.3 and 3.4 apply the model to mortality in the Netherlands, 1890- 
1990. The following chapter argues that the fitted model must be modified to 
obtain plausible forecasts and shows how this can be done. 



3.2 | A Regression Model of Mortality

With $\delta(t,x)$ a measure of mortality at the (time, age) point $(t,x)$, consider:

$$Y(t,x) = \delta(t,x) + \varepsilon(t,x). \tag{3.1}$$

Here $Y(t,x)$ is an observable estimator of $\delta(t,x)$ and $\varepsilon(t,x)$ is a random
error. Models such as (3.1) immediately raise two questions: what is a useful
formula for the deterministic mortality surface $\delta(t,x)$, and what is the
stochastic structure of the errors $\varepsilon(t,x)$? It will emerge that mortality surfaces
which are smooth functions of the time- and age-specific probability of death
are convenient and informative and also lead to the desirable property that the
errors $\varepsilon(t,x)$ are approximately independently and normally distributed, albeit
with different variances. In these circumstances standard methods for the
analysis of heteroscedastic regression models can be applied to estimate a
parameterised mortality surface and this is the approach that is developed here.
The arguments presented here do not consider migration and should be placed
in the context of a life table methodology for closed populations as developed
in Chiang (1968, 1984).

First, some notation. Let:

$$\begin{aligned}
\ell(t,x) &= \text{number of lives of exact age } x \text{ at exact time } t \\
&= \text{number of lives of exact age } x \text{ of the cohort born at time } t - x.
\end{aligned}$$

With $c = t - x$ write:

$$\ell_c(x) = \ell(t,x)$$

to denote the number of individuals in cohort $c$ who survive to age $x$. Note
that:

$$\begin{aligned}
\ell_c(0) &= \text{number of live births at } c \\
&= \text{initial size of cohort } c.
\end{aligned}$$

For the sake of convenience, we work with an integer grid of $(t,x)$ points.
Then the time- and age-specific probability of death is:

$$\begin{aligned}
q(t,x) &= \text{probability that an individual alive at } (t,x) \text{ will die within one year} \\
&= [\ell(t,x) - \ell(t+1,x+1)]/\ell(t,x).
\end{aligned}$$
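
As a minimal sketch of this definition, the probability of death can be computed along the diagonals of a small grid of survivor counts. The function name and the counts below are hypothetical; the grid is assumed to be stored with rows for consecutive years and columns for consecutive integer ages.

```python
def death_probabilities(ell):
    """q(t, x) = [l(t, x) - l(t+1, x+1)] / l(t, x) on an integer grid.

    ell[i][j] holds the number of lives at year offset i and age offset j.
    The last row and last column have no diagonal successor, so the result
    has one fewer row and one fewer column than the input.
    """
    n_years, n_ages = len(ell), len(ell[0])
    return [
        [(ell[i][j] - ell[i + 1][j + 1]) / ell[i][j] for j in range(n_ages - 1)]
        for i in range(n_years - 1)
    ]


# Hypothetical survivor counts: three years (rows) by three ages (columns).
ell = [
    [1000.0, 980.0, 950.0],
    [1010.0, 990.0, 965.0],
    [1020.0, 1000.0, 975.0],
]
q = death_probabilities(ell)
```

Each entry of `q` compares a count with the count one year later and one year older, which is the same cohort followed along its diagonal.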

Note that $q(t,x)$ is calculated along diagonals of the Lexis plane. In cohort
notation, with $c = t - x$,

$$q_c(x) = q(t,x) = 1 - \exp\left\{-\int_x^{x+1} \mu_c(u)\,du\right\},$$

where $\mu_c(x)$, $0 \le x$, is the force of mortality of cohort $c$.
Knowledge of $q_c(x)$ is equivalent to knowing the integrated force
$\int_x^{x+1} \mu_c(u)\,du$, and whilst not providing as much information as $\mu_c(x)$ itself,
nonetheless permits useful calculations. For example, if $y$ is a positive integer
the conditional survival curve from $x$ to $x + y$ for cohort $c$ is:

$$\Pr(\text{survive to age } x + y \mid \text{survived to } x) = \exp\left\{-\int_x^{x+y} \mu_c(u)\,du\right\} = \prod_{k=0}^{y-1} [1 - q_c(x+k)].$$

Thus knowledge of these conditional probabilities enables the calculation of
life expectancies and other measures over sets of integer ages.
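
The product form of the conditional survival curve lends itself to direct computation. The sketch below is illustrative only: the function names and the three probabilities of death are hypothetical, and the expectation computed is a partial (curtate) one, counting complete years lived within the ages covered by the input.

```python
def conditional_survival(q_cohort, y):
    """Pr(survive to x + y | survived to x) as the product of 1 - q_c(x + k).

    q_cohort[k] is assumed to hold q_c(x + k) for k = 0, 1, 2, ...
    """
    prob = 1.0
    for q_k in q_cohort[:y]:
        prob *= 1.0 - q_k
    return prob


def partial_curtate_expectation(q_cohort):
    """Expected number of complete years lived over the ages covered by q_cohort."""
    total = 0.0
    for y in range(1, len(q_cohort) + 1):
        total += conditional_survival(q_cohort, y)
    return total


q_cohort = [0.01, 0.02, 0.03]   # hypothetical q_c(x), q_c(x+1), q_c(x+2)
p2 = conditional_survival(q_cohort, 2)          # 0.99 * 0.98
e3 = partial_curtate_expectation(q_cohort)      # 0.99 + 0.9702 + 0.941094
```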

Two useful functions of $q(t,x)$ are first, an estimate of $\mu_c(x)$ by the central
rate:

$$m_c(x) = \frac{2q_c(x)}{2 - q_c(x)} = \frac{2q(t,x)}{2 - q(t,x)} = m(t,x) \quad \text{if } c = t - x,$$

and second, the logistic or log(odds) transformation. It is the second,

$$\delta(t,x) = \log\left[\frac{q(t,x)}{1 - q(t,x)}\right] = \operatorname{logit} q(t,x) = \log(\text{odds}), \tag{3.2}$$

that is defined as the mortality surface used in this study. Whilst any measure
of mortality that varies with time and age can be used to define a mortality
surface, the quantity defined by (3.2) possesses certain advantages. It is a
differentiable monotonic function of $q(t,x)$ that varies between $-\infty$ and $\infty$
and is familiar from extensive use in statistical practice. Thus we consider the
regression (3.1) with $\delta(t,x)$ given by (3.2).
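
The two transformations of the probability of death can be written down in a few lines. This is a sketch with hypothetical function names and a hypothetical value of q; it also shows that the logit is invertible, so that a fitted surface can be mapped back to probabilities.

```python
import math


def central_rate(q):
    """Central rate implied by a one-year probability of death: 2q / (2 - q)."""
    return 2.0 * q / (2.0 - q)


def logit(q):
    """Log odds of death, the mortality surface of (3.2)."""
    return math.log(q / (1.0 - q))


def inverse_logit(delta):
    """Recover q from a value of the mortality surface."""
    return 1.0 / (1.0 + math.exp(-delta))


q = 0.02                       # hypothetical probability of death
m = central_rate(q)            # slightly larger than q
delta = logit(q)               # negative whenever q < 1/2
q_back = inverse_logit(delta)  # round trip back to q
```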

The probabilities $q(t,x)$ must be estimated from numerical data, typically
counts of the numbers of lives at points $(t,x)$ of a portion of the Lexis plane.
These counts are interpreted as realisations of random variables. Random
variables will be denoted by writing a tilde over a Latin letter, thus
$\tilde{\ell}(t,x)$ is the random variable whose realisation is the observed number of
lives of exact age $x$ at exact time $t$. Another random variable is

$$\tilde{q}(t,x) = [\tilde{\ell}(t,x) - \tilde{\ell}(t+1,x+1)]/\tilde{\ell}(t,x).$$

The corresponding population quantities $\ell(t,x)$, $q(t,x)$ are unknown and
numerical estimates of them are given by realisations of $\tilde{\ell}(t,x)$,
$\tilde{q}(t,x)$ respectively.

Realisations of random variables will carry a double tilde. Thus realisations of
$\tilde{\ell}(t,x)$ and $\tilde{q}(t,x)$ are respectively:

$$\tilde{\tilde{\ell}}(t,x) = \text{realisation of } \tilde{\ell}(t,x) = \text{number of observed survivors at } (t,x),$$

$$\tilde{\tilde{q}}(t,x) = \frac{\tilde{\tilde{\ell}}(t,x) - \tilde{\tilde{\ell}}(t+1,x+1)}{\tilde{\tilde{\ell}}(t,x)}.$$

Typically, data are available in the form of numbers $\tilde{\tilde{\ell}}(t,x)$ of survivors for
$(t,x)$ in a portion of the Lexis plane.

Under generally accepted assumptions, the counts $\tilde{\ell}(t,x)$ are binomially
distributed. This is the classical point of departure of a discussion of the
probabilistic properties of the life table (Chiang, 1968) and depends on two
important assumptions:

Assumptions

(i) Cohorts are statistically independent as far as mortality is concerned, i.e.,
    in the regression (3.1) the error random variables $\varepsilon(c_1 = t_1 - x, x)$ and
    $\varepsilon(c_2 = t_2 - x, x)$ are independent when $c_1 \ne c_2$.

(ii) Within a cohort the lifetimes of individuals are independent and identically
    distributed, i.e. for cohort $c$ the $\ell_c(0)$ lifetimes of individuals born at
    $c$ are independent with common distribution function:

    $$1 - \exp\left\{-\int_0^x \mu_c(u)\,du\right\} = 1 - p_c(x),$$

    where $p_c(x) = \ell_c(x)/\ell_c(0)$.

The second assumption is the classical assumption of homogeneity within a
cohort and leads to the binomial distribution of survivors (see, for example,
Chiang, 1968, Ch. 10). This follows from properties of the indicators:

$$I_c^{(r)}(x) = \begin{cases} 1 & \text{if individual } r \text{ of cohort } c \text{ survives to age } x, \\ 0 & \text{otherwise,} \end{cases}$$

with mean and variance:

$$E\,I_c^{(r)}(x) = p_c(x), \qquad \operatorname{Var} I_c^{(r)}(x) = p_c(x)[1 - p_c(x)].$$

But $\tilde{\ell}_c(x) = \sum_{r=1}^{\ell_c(0)} I_c^{(r)}(x)$ is the number of survivors of cohort $c$ to age $x$ and
therefore, under Assumption (ii), is binomially distributed. Note that failure of
Assumption (ii) will generally lead to a different distribution for the number of
survivors.

Assumption (i) implies that $\tilde{\ell}_{c_1}(x)$, $\tilde{\ell}_{c_2}(x)$ are independent when $c_1 \ne c_2$ and
hence that there are generally different independent binomial distributions for
the numbers of survivors along distinct diagonals of the Lexis plane. This is
the important consequence of the classical Assumptions (i) and (ii) since these
binomial distributions of the $\tilde{\ell}_c(x)$ determine the underlying probabilistic
structure of the regression (3.1).

If Assumption (ii) does not hold, the heterogeneity of the indicators $I_c^{(r)}(x)$,
$r = 1, 2, \ldots, \ell_c(0)$, raises important and difficult questions, and will likely give
rise to the phenomenon of extrabinomial variation (see e.g. McCullagh and
Nelder, 1989; Williams, 1982 and also Pollard, 1970). Perhaps the most
appropriate description of heterogeneity in the context of mortality is the
cluster model in Section 4.5 of McCullagh and Nelder (1989). This implies
that $\tilde{\ell}_c(x)$ is distributed as the sum of independent but different binomial
random variables and requires disaggregated data (e.g. cause of death) for
detailed estimation. It is beyond the scope of this chapter to embark on a
detailed discussion of the problem and in fact it seems to have only limited
impact in the particular case of Dutch mortality considered here. Recall that
only aggregate data are available so that the nature of the heterogeneity, if any,
cannot be directly estimated. We will therefore assume that Assumption (ii)
fails only to the extent of leading to extrabinomial variation for which a
correction must be made to the standard errors of estimates.

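With $c = t - x$ consider the regression (3.1) in cohort form:

$$Y_c(x) = \delta_c(x) + \varepsilon_c(x),$$

with the mortality surface of (3.2) written:

$$\delta_c(x) = \operatorname{logit} q_c(x) = \log\left[\frac{\ell_c(x) - \ell_c(x+1)}{\ell_c(x+1)}\right].$$

The distribution of the errors $\varepsilon_c(x)$ is found by the following argument.
With $Y_c(x)$ the random variable corresponding to the mortality surface $\delta_c(x)$
we have on rearranging terms:

$$\begin{aligned}
\varepsilon_c(x) = Y_c(x) - \delta_c(x) &= \log\left[\frac{\tilde{\ell}_c(x) - \tilde{\ell}_c(x+1)}{\ell_c(x) - \ell_c(x+1)}\right] - \log\left[\frac{\tilde{\ell}_c(x+1)}{\ell_c(x+1)}\right] \\
&\approx \frac{[\tilde{\ell}_c(x) - \tilde{\ell}_c(x+1)] - [\ell_c(x) - \ell_c(x+1)]}{\ell_c(x) - \ell_c(x+1)} - \frac{\tilde{\ell}_c(x+1) - \ell_c(x+1)}{\ell_c(x+1)}
\end{aligned}$$

in probability as $\ell_c(0) \to \infty$. Thus the large sample distributions of the residuals
$\varepsilon_c(x)$ depend on the joint distributions of $\tilde{\ell}_c(x)$ and $\tilde{\ell}_c(x+1)$.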

Under Assumption (ii) the latter is easily found to be of a compound binomial
form with moment generating function:

$$E\left[\exp\{r_1\tilde{\ell}_c(x) + r_2\tilde{\ell}_c(x+1)\}\right] = \left[1 - p_c(x) + \{p_c(x) - p_c(x+1)\}e^{r_1} + p_c(x+1)e^{r_1+r_2}\right]^{\ell_c(0)}.$$

Here, as before, $p_c(x) = \ell_c(x)/\ell_c(0)$ is the probability of a member of
cohort $c$ being alive at age $x$. An assumption other than (ii) will lead to a
different distribution of frequencies and hence a different distribution of
$\varepsilon_c(x)$.

A standard argument using the moment generating function shows that the
$\varepsilon_c(x)$ are asymptotically normally distributed as $\ell_c(0) \to \infty$. Further, the
result of pages 227-228 of Chiang (1968) implies asymptotic independence for
different ages. Returning to the $(t,x)$ notation, the result is:

For large $\ell_c(0)$ the residuals $\varepsilon(t,x)$ of (3.1) with mortality surface (3.2) are
approximately normally and independently distributed with zero means and
variances:

$$\operatorname{Var}\varepsilon(t,x) = \sigma^2(t,x) = \frac{1}{\ell(t,x) - \ell(t+1,x+1)} + \frac{1}{\ell(t+1,x+1)}.$$

The form of the error variance $\sigma^2(t,x)$ suggests the use of weighted least
squares or even maximum likelihood to estimate a parameterised mortality
surface:

$$\delta(t,x) = \delta(\beta;t,x).$$



A natural parameterisation to use if $x$ varies over all ages is that due to
Heligman and Pollard (1980). However, there are then a large number of
parameters to be non-linearly estimated when the number of cohorts is large.
To reduce the scale of the estimation the application treated here is for ages
$x \ge 40$ since in that case $\delta(\beta;t,x)$ can be taken as linear in the elements of
$\beta$.

Minimising the weighted least squares loss function:

$$\sum_{(t,x)} \frac{[y(t,x) - \delta(\beta;t,x)]^2}{\sigma^2(t,x)}$$

will yield the estimate $\hat{\beta}$ of $\beta$. A check for extrabinomial variation can be
made by calculating:

$$\frac{1}{n-p}\sum_{(t,x)} \frac{[y(t,x) - \delta(\hat{\beta};t,x)]^2}{\sigma^2(t,x)} = \phi, \tag{3.3}$$

where $n$ = total number of data points used and $p$ = number of parameters
estimated.

The summands are the squares of standardised residuals and if Assumption (ii)
holds $\phi$ should be close to unity. Thus extrabinomial variation presents a
problem only when $\phi$ differs from 1 in which case $\phi\sigma^2(t,x)$ is taken as the
final estimate of the error variance $\operatorname{Var}\varepsilon(t,x)$. It is this quantity that is used to
obtain standard errors of the estimated regression.
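
The estimation steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the S-plus routine used for the chapter: it solves the weighted normal equations once with known weights (rather than iterating), treats every column of the design matrix, including the intercept, as an estimated parameter when forming n - p, and uses hypothetical survivor counts and a hypothetical one-predictor model. All function names are illustrative.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def logit_variance(ell_now, ell_next):
    """sigma^2(t,x) = 1/[l(t,x) - l(t+1,x+1)] + 1/l(t+1,x+1)."""
    return 1.0 / (ell_now - ell_next) + 1.0 / ell_next


def weighted_least_squares(X, y, sigma2):
    """Return (beta_hat, phi, standard errors inflated by sqrt(phi))."""
    n, p = len(X), len(X[0])
    w = [1.0 / s for s in sigma2]
    xtwx = [[sum(w[i] * X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
            for a in range(p)]
    xtwy = [sum(w[i] * X[i][a] * y[i] for i in range(n)) for a in range(p)]
    cov = invert(xtwx)
    beta = [sum(cov[a][b] * xtwy[b] for b in range(p)) for a in range(p)]
    resid = [y[i] - sum(X[i][a] * beta[a] for a in range(p)) for i in range(n)]
    phi = sum(w[i] * resid[i] ** 2 for i in range(n)) / (n - p)   # as in (3.3)
    se = [math.sqrt(phi * cov[a][a]) for a in range(p)]
    return beta, phi, se


# Hypothetical counts l(t,x) and l(t+1,x+1) at five ages of one cohort.
ages = [40.0, 50.0, 60.0, 70.0, 80.0]
ell_now = [9000.0, 8500.0, 7500.0, 5500.0, 2500.0]
ell_next = [8970.0, 8440.0, 7380.0, 5280.0, 2200.0]
y = [math.log((a - b) / b) for a, b in zip(ell_now, ell_next)]   # empirical logits
sigma2 = [logit_variance(a, b) for a, b in zip(ell_now, ell_next)]
X = [[1.0, age] for age in ages]                                  # intercept and age
beta_hat, phi, se = weighted_least_squares(X, y, sigma2)
```

Multiplying the covariance matrix by `phi` leaves `beta_hat` unchanged and rescales only the standard errors, which is the global correction described in the text.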

We should comment that a more detailed examination of extrabinomial
variation would proceed by examining heterogeneity both within and between
cohorts. This could lead to different weights for different cohorts. What use of
(3.3) does is to provide a rough global numerical correction that does not
change the value of $\hat{\beta}$ but does alter the standard errors. The method is now
applied to the mortality of Dutch males and females.



3.3 | Mortality in the Netherlands: Formulating a Model

The first step is the calculation of the realised numbers $\tilde{\tilde{\ell}}(t,x)$ of survivors at
(time, age) points $(t,x)$ for which data are available. NIDI, courtesy of E.
Tabeau, supplied end-of-year population numbers $E(t,x)$ and numbers of
deaths $d(t,x)$ by year and sex from 1850 to 1990. Calculations were made
along diagonal areas of the Lexis plane as illustrated in Figure 3.2. The
formula used was:



$$\tilde{\tilde{\ell}}(t,x) = E(t,x) + d(t,x) + E(t-1,x-1) - d(t,x-1)$$



Calculation can now be made of numerical estimates of the time- and age-
specific probabilities of death:

$$\tilde{\tilde{q}}(t,x) = 1 - \tilde{\tilde{\ell}}(t+1,x+1)/\tilde{\tilde{\ell}}(t,x)$$

and the empirical mortality surface:

$$y(t,x) = \operatorname{logit}\tilde{\tilde{q}}(t,x) = \log(\text{odds at } (t,x)). \tag{3.4}$$



Figure 3.2. Lexis diagram of population data along a cohort 

It is important that $y(t,x)$ be examined in some detail. To conserve space
some plots are given for males only. Figure 3.3 is the empirical mortality
surface for males when $1850 \le t \le 1990$ and $1 \le x \le 100$. Along cross
sections when $t$ is fixed or along diagonals, one observes the well-known
presence of a trough at about age ten, a ridge in the early twenties (less
pronounced for females), and an apparent linearisation at middle and older
ages. These features are illustrated in Figure 3.3 for any cross section of male
mortality at any year $t$.

Figure 3.3. Dutch male observed log(odds), ages 1-100, years 1850-1990

Mortality does not vary uniformly over the $(t,x)$ plane and the advantage of
plots as in Figure 3.3 is that they facilitate examination of the way mortality
changes with year and cohort as well as with age. Roughness of the surface
indicates volatility and ridges along cross sections at given years mark brief
episodes of excess mortality such as the influenza epidemic of 1918-19 and the
famine of the winter of 1944-45. Clearly a new mortality regime came into
being with the end of the Second World War. Infant and childhood mortality
continued to fall but the picture for changes at adult ages is mixed. Volatility
diminished after the war as can be seen from the relative smoothness of the
surface after 1946. Figure 3.4 exhibits the divergence between the sexes of
postwar adult mortality with that for females generally continuing the pre-war
trend, but with middle-aged males exhibiting a hump (coronary deaths?) that
peaked about 1970. It is of crucial importance that these plots be subjected to
careful scrutiny before proceeding with the formulation of a model. An
immediate inference is that the sexes should be modelled separately.

Selection of a parametric model for $\delta(\beta;t,x)$ is an essentially subjective
process requiring a balance between parsimony and the inclusion of a
sufficient number of parameters to adequately describe apparently interesting
features. For a single year, or single cohort, Heligman and Pollard (1980)
model an age curve of mortality by a formula composed of parts describing
logit $q(t,x)$ in childhood, in early adulthood, and thirdly for older ages. The
formula contains eight parameters. These are clearly not constant with time or
cohort and the fit of a time-varying Heligman-Pollard formula to an empirical
mortality surface such as that in Figure 3.3 is likely to involve the estimation
of a large number of parameters. The difficulties are not insurmountable but
rather than embark on such an undertaking we have chosen to investigate an
algebraically simpler but still fairly detailed model over a restricted set of ages.

The set of ages to which the discussion will now be confined is that for which 
a Gompertz-like model is plausible, namely for ages above about 40. There is 
then the advantage of linearity in the elements of the coefficient vector $\beta$.

Further, the discussion is for the century 1890 to 1990. This time-span seems 
more than adequate for the purposes of modelling and has the additional 
advantage of eliminating most of the data points at older ages from eighteenth- 
century cohorts that appeared unreliable. In fact, as the modelling progressed 
it became clear that there were also difficulties with some of the more recent 
these reasons, in what follows attention is restricted to fitting a formula to the 
empirical mortality surface $y(t,x)$ of (3.4) over the part of the Lexis plane
$L(t,x)$ defined by:

$$\begin{aligned}
1890 \le t \le 1919, &\quad 40 \le x \le 90; \\
t = 1920, &\quad 40 \le x \le 91; \\
1921 \le t \le 1990, &\quad 40 \le x \le 94.
\end{aligned} \tag{3.5}$$

Thus the surface is to be estimated from numerical counts at the 5432
points $(t,x)$ of $L(t,x)$.

Figure 3.4. Dutch male and female log(odds) for various ages, 1890-1990
(males bold, females dashed)
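
The number of grid points can be checked by enumeration. The sketch below assumes that all bounds in (3.5) are inclusive and that the first band of years ends at 1919, the year before the single-year band at 1920; under that reading the count agrees with the 5432 points quoted. The function name is illustrative.

```python
def region_points():
    """Enumerate the integer (year, age) points of the region (3.5).

    Assumed reading of the bounds (all inclusive):
      1890-1919: ages 40-90;  1920: ages 40-91;  1921-1990: ages 40-94.
    """
    points = []
    for year in range(1890, 1991):
        if year <= 1919:
            max_age = 90
        elif year == 1920:
            max_age = 91
        else:
            max_age = 94
        points.extend((year, age) for age in range(40, max_age + 1))
    return points


n_points = len(region_points())   # 30*51 + 52 + 70*55
```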



Consider now the selection of terms in a model $\delta(\beta;t,x)$ for logit $q(t,x)$ over
the region $L(t,x)$ of (3.5) which is linear in the elements of $\beta$. Reference to
Figure 3.3 suggests linear and perhaps quadratic terms in $t$ and $x$, and in fact
the powers selected as predictors were $x$, $t$, $x^2$ and $tx$. This gives the
preliminary version:

$$\begin{aligned}
\delta(\beta;t,x) &= \beta_0 + \beta_1 x + \beta_2 t + \beta_3 x^2 + \beta_4 tx \\
&= (\beta_0 + \beta_2 t) + (\beta_1 + \beta_4 t)x + \beta_3 x^2,
\end{aligned}$$

which models logit $q(t,x)$ as a quadratic in age $x$ with two coefficients linear
in time $t$. Recall that the upper age bound is $x = 94$, due to data limitations, so
no attempt is being made to model mortality for the oldest old. Next, to
account for the ridges at $t = 1918$ and $t = 1945$ introduce the indicators (dummy
variables):

$$I(1918) = \begin{cases} 1 & \text{if year} = 1918, \\ 0 & \text{otherwise,} \end{cases}$$

and

$$I(1945) = \begin{cases} 1 & \text{if year} = 1945, \\ 0 & \text{otherwise.} \end{cases}$$

Further, to describe the changed regime after the war introduce:

$$I(>1945) = \begin{cases} 1 & \text{if year} > 1945, \\ 0 & \text{otherwise,} \end{cases}$$

and its interaction with the polynomial of the preliminary model. We therefore
have the improved version:

$$\begin{aligned}
\delta(\beta;t,x) ={}& \beta_0 + \beta_1 x + \beta_2 t + \beta_3 x^2 + \beta_4 tx + \beta_5 I(1918) + \beta_6 I(1945) \\
&+ (\beta_7 + \beta_8 x + \beta_9 t + \beta_{10} x^2 + \beta_{11} tx)\,I(>1945).
\end{aligned} \tag{3.6}$$
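
Since the model (3.6) is linear in the coefficients, each (time, age) point contributes one row of a design matrix. The following sketch builds such a row; the function name and the column ordering (coefficients 0 to 11 in the order of (3.6)) are illustrative assumptions rather than a description of the software actually used.

```python
def design_row_3_6(t, x):
    """Return the 12 regressors of model (3.6) at year t and age x.

    Columns: 1, x, t, x^2, t*x, I(1918), I(1945), and then the same five
    polynomial terms multiplied by the post-war indicator I(>1945).
    """
    polynomial = [1.0, float(x), float(t), float(x) ** 2, float(t) * float(x)]
    i_1918 = 1.0 if t == 1918 else 0.0
    i_1945 = 1.0 if t == 1945 else 0.0
    post_war = 1.0 if t > 1945 else 0.0
    return polynomial + [i_1918, i_1945] + [post_war * term for term in polynomial]


row_prewar = design_row_3_6(1918, 40)    # epidemic year, post-war block all zero
row_postwar = design_row_3_6(1950, 60)   # post-war block repeats the polynomial
```

Stacking these rows over the points of the region gives a matrix to which weighted least squares can be applied directly.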

This is the basic descriptive model for both Dutch male and female mortality
over the (time, age) region (3.5). Additional parameters can be added to 
describe other features of the empirical surface (3.4) but at the cost of 
increasing complexity, although it must be said that the complexity becomes 
one of interpretation rather than of computation. 



Separate models were fitted for males and females. The one for males
contained 28 predictors, including those introduced in the previous paragraph.
Of particular interest is the predictor

$$\text{dnorm} = \frac{1}{8\sqrt{2\pi}}\exp\left\{-\frac{1}{2}\left(\frac{\text{year} - 1970}{8}\right)^2\right\}$$

in the model for male mortality. This was included to at least partially account
for excess male mortality in the sixties and seventies (see Figure 3.4) due
presumably to coronary heart disease, and took the form of the ordinate of a
normal density centred at 1970 with standard deviation 8. Thus dnorm(year,
1970, 8) gives most weight to the ridge along $t = 1970$ with neighbouring years
weighted according to the above normal density. Entries for ages 60 and 70 in
Figure 3.4 show the shape of the bump that one is attempting to model.
Furthermore, the shape varies with age and this is treated by including the
interaction (age)$^2$ $\times$ dnorm (it emerged that the first order interaction age $\times$ dnorm
was not significant).
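
The predictor is simple to evaluate. The sketch below reproduces the ordinate of a normal density centred at 1970 with standard deviation 8, as described in the text; the Python function merely borrows the name dnorm, and the helper for the interaction with squared age is a hypothetical illustration.

```python
import math


def dnorm(year, mean=1970.0, sd=8.0):
    """Ordinate of a normal density with the given mean and standard deviation."""
    z = (year - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def dnorm_age_squared(year, age):
    """Interaction predictor (age)^2 x dnorm used to let the hump vary with age."""
    return age ** 2 * dnorm(year)


peak = dnorm(1970)                             # largest weight, at the centre
one_sd_ratio = dnorm(1978) / peak              # exp(-1/2) one standard deviation away
```

Years eight years either side of 1970 therefore receive about 61 per cent of the weight given to 1970 itself.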

Incidentally, it is worth commenting that no such coronary hump was observed 
in a similar analysis of French male mortality. Presumably this reflects dietary
and/or drinking habits. 

The predictors finally included in the models appear in the left-hand columns
of Table 3.1 and Table 3.2 of the next section. As mentioned, a predictor is
meant to reflect an interesting feature of the mortality surface. For example,
$I(1917, 1919)$ is included to account for effects due to the Spanish Influenza
and also the latter stages of the First World War, and so on. Whether it was
really necessary to include so many predictors is a moot point but our decision
as modellers was to err on the side of perhaps too detailed a description. Thus
both male and female mortality are modelled by surfaces of the form:

$$\operatorname{logit} q(t,x) = \delta(\beta;t,x) = \beta_0 + \sum_{i=1}^{p} \beta_i X_i, \tag{3.7}$$

in which $X_i$ denotes the $i$th predictor and $p = 28$ for males and $p = 24$ for
females. Apart from dnorm all predictors are low-order powers of $t$ and $x$.

Observe that the approximation $m \approx 2q/(2-q)$ and (3.7) imply the logistic
expression:

$$m(t,x) = \frac{2}{1 + 2\exp\left\{-\left(\beta_0 + \sum_{i=1}^{p} \beta_i X_i\right)\right\}}$$

for the central death rate.
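
The step from the logit to the central death rate is a short piece of algebra: writing $q = 1/(1 + e^{-\delta})$ gives $2q/(2-q) = 2/(2/q - 1) = 2/(1 + 2e^{-\delta})$. The sketch below checks this identity numerically for a few hypothetical values of q; the function names are illustrative.

```python
import math


def central_rate_from_q(q):
    """Central death rate from the probability of death: 2q / (2 - q)."""
    return 2.0 * q / (2.0 - q)


def central_rate_from_logit(delta):
    """Central death rate from the log odds: 2 / (1 + 2 exp(-delta))."""
    return 2.0 / (1.0 + 2.0 * math.exp(-delta))


# Hypothetical probabilities of death spanning low to high mortality.
differences = []
for q in (0.001, 0.02, 0.3):
    delta = math.log(q / (1.0 - q))
    differences.append(abs(central_rate_from_q(q) - central_rate_from_logit(delta)))

largest_difference = max(differences)   # zero up to rounding error
```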



3.4 | A Descriptive Model for Mortality in the Netherlands

Application of the procedure described at the end of Section 3.2 to the male
and female versions of (3.7) gave the results displayed in Table 3.1 and Table
3.2.

Table 3.1 deals with the fitted surface:

$$\delta(\hat{\beta};t,x) = \hat{\beta}_0 + \sum_{i=1}^{28} \hat{\beta}_i X_i, \tag{3.8}$$

describing male mortality. Here $X_1, X_2, \ldots, X_{28}$ denote the 28 predictors
which, together with the intercept, are listed in the first column (variable) of
the table. The estimates $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_{28}$ appear in the second column. Calculation
of the standard errors given in the third column was by an S-plus routine
inflated by $\sqrt{\phi}$, where $\phi = 2.87431$ is the extrabinomial factor (3.3) for
Dutch males. Ignoring this factor would have given standard errors that were
about 1.7 times too small, thus leading to spurious accuracy.

Table 3.2 gives the corresponding results for a model of female mortality with
24 predictors. These 24 are included in the model for male mortality, the four
missing ones being dnorm, dnorm $\times\, x^2$, $I(1944) \times x$ and $I(1945)$, which
were found to be not significant. The extrabinomial factor of $\phi = 3.85372$ is
incorporated in the standard errors of the third column.
Table 3.1. Dutch males. Results for the fit to $\delta(\beta;t,x) = \beta_0 + \sum_{i=1}^{28} \beta_i X_i(t,x)$ in (3.5)

Variable               Value            Standard error    t-value
Intercept               6.275499e+01    2.143              29.28
age                    -7.969149e-01    0.031             -25.71
year                   -3.657961e-02    0.001             -32.60
age^2                   4.526471e-04    1.61e-05           28.16
age:year                4.325018e-04    1.63e-05           26.58
I(1890-1893)            9.161620e-02    0.013               7.10
I(1899)                 1.266344e-01    0.021               6.00
I(1911-1915)           -2.611437e-02    0.010              -2.60
I(1917,1919)            1.948810e-01    0.079               2.46
I(1918)                 8.357976e-01    0.099               8.45
I(1922-1925)           -9.075166e-02    0.011              -8.13
I(1931-1937)           -6.305437e-02    0.010              -6.44
I(1940-1942)            2.838397e-01    0.065               4.40
I(1943)                 5.975377e-01    0.010               6.00
I(1944)                 1.637074e+00    0.083              19.68
I(>1945)               -7.05880e+01     2.934             -24.06
I(1917,1919):age       -2.589723e-03    0.001              -2.21
I(1918):age            -9.443955e-03    0.001              -6.37
I(1940-1942):age       -3.253200e-03    0.001              -3.51
I(1943):age            -7.656461e-03    0.001              -5.30
I(1945):age            -1.493579e-02    0.001             -10.42
I(>1945):age            9.160680e-01    0.042              21.94
I(>1945):age^2         -6.103053e-04    2.05e-05          -29.79
I(>1945):year           3.492834e-02    0.002              23.05
I(>1945):age:year      -4.302857e-04    2.16e-05          -19.90
I(1945)                 1.078755e+00    0.098              11.01
I(1944):age            -1.726907e-02    0.001             -14.07
dnorm(year,1970,8)      7.800986e+00    0.400              19.52
dnorm:age^2            -1.077510e-03    7.45e-05          -14.46

Table 3.2. Dutch females. Results for the fit to $\delta(\beta;t,x) = \beta_0 + \sum_{i=1}^{24} \beta_i X_i(t,x)$ in (3.5)

Variable               Value            Standard error    t-value
Intercept               3.884869e+01    2.5745             15.09
age                    -5.141349e-01    0.037             -13.96
year                   -2.395037e-02    0.001             -17.80
age^2                   5.371434e-04    2.08e-05           25.84
age:year                2.788697e-04    1.93e-05           14.46
I(1890-1893)            1.186390e-01    0.017               6.91
I(1899)                 1.576229e-01    0.028               5.67
I(1911-1915)           -3.147721e-02    0.013              -2.37
I(1917,1919)            2.309600e-01    0.104               2.22
I(1918)                 6.592334e-01    0.132               4.98
I(1922-1925)           -7.236351e-02    0.015              -4.98
I(1931-1937)           -6.340533e-02    0.013              -4.95
I(1940-1942)           -2.652525e-01    0.085              -3.13
I(1943)                -1.358712e-01    0.134              -1.01
I(1944)                 1.771916e-01    0.024               7.51
I(>1945)               -2.273911e+01    3.905              -5.82
I(1917,1919):age       -3.351955e-03    0.002              -2.20
I(1918):age            -7.162427e-03    0.002              -3.67
I(1940-1942):age        3.430578e-03    0.001               2.85
I(1943):age             1.931614e-03    0.002               1.01
I(1945):age            -1.217961e-03    3.69e-04           -3.30
I(>1945):age            6.601393e-01    0.054              12.27
I(>1945):age^2         -1.031750e-04    2.76e-05           -3.75
I(>1945):year           1.104260e-02    0.002               5.48
I(>1945):age:year      -3.277412e-04    2.79e-05          -11.76



Figure 3.5 is a graphical display of the surface summarised numerically in 
Table 3.1. Highlights are the ridges at years of excess mortality and changes in 
trend and level at about 1946. Also to be noted is the contribution of dnorm as 
a smoothed ridge centred along t = 1970, a feature not present in the surface 
for females (which, to conserve space, is not shown). 

Figure 3.5. Plot of the fitted mortality surface of Dutch males (see Table 3.1) 

It is differences in pre- and post-war trends that warrant some discussion. 
Whilst pre-war mortality patterns are generally similar for males and females, 
this is certainly not the case after 1945. Female mortality continues to 
decrease, if anything at an increasing rate at older ages, whereas male 
mortality flattens out and even temporarily increases after 1946. A comparison 
of male and female post-war mortality can be formalised by examining the two 
fitted mortality surfaces. For males aged 40 ≤ x ≤ 94 in the years 
1946 ≤ t ≤ 1990, the indicators before 1946 have no effect and Table 3.1 
gives the following fitted surface: 

$$
\begin{aligned}
\hat{\delta}_M(\hat{\beta};t,x) &= 62.755 - 0.7969x - 0.03658t + 0.0004526x^2 \\
&\quad + 0.0004325tx + (7.801 - 0.001078x^2)\,\mathrm{dnorm} \\
&\quad + (-70.588 + 0.9161x + 0.03493t - 0.00061031x^2 - 0.0004303tx)\,I(>1945) \\
&= -7.833 + 0.1192x - 0.00165t - 0.00015771x^2 + 0.0000022tx \\
&\quad + (7.801 - 0.001078x^2)\,\mathrm{dnorm}. \qquad (3.9)
\end{aligned}
$$



Recall that: 

$$ \mathrm{dnorm} = \frac{1}{8\sqrt{2\pi}} \exp\left\{ -\frac{1}{2} \left( \frac{t-1970}{8} \right)^{2} \right\} $$

so that its effect is negligible outside the interval 1950 to 1990 approximately. 
Also recall that results for cohort c can be obtained by the substitution 
t = c + x. 

Comment should also be made on the fact that the coefficient of $x^2$ in (3.9) is 
negative (and significantly different from zero), implying that 
$\hat{\delta}_M(\hat{\beta};t,x) \to -\infty$ as $x \to \infty$. However, the model is clearly inappropriate 
for very large x, say x > 130, and in fact calculations show that $\hat{\delta}_M(\hat{\beta};t,x)$ is 
increasing in x for all ages up to at least 150. 
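
The monotonicity claim can be checked with a few lines of code. This is a sketch under stated assumptions: it uses the rounded coefficients printed in (3.9), takes dnorm to be the normal density with mean 1970 and standard deviation 8, and the function names are illustrative rather than the authors' own.

```python
import math


def dnorm(t, mean=1970.0, sd=8.0):
    """Normal density in calendar year t, the smoothed ridge used as a regressor."""
    return math.exp(-0.5 * ((t - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def delta_male(t, x):
    """Post-war male surface (3.9), rounded coefficients as printed."""
    return (-7.833 + 0.1192 * x - 0.00165 * t - 0.00015771 * x ** 2
            + 0.0000022 * t * x + (7.801 - 0.001078 * x ** 2) * dnorm(t))


def is_increasing_in_age(t, ages):
    """True if the surface rises at every one-year age step in the given year."""
    values = [delta_male(t, x) for x in ages]
    return all(b > a for a, b in zip(values, values[1:]))


for year in (1946, 1970, 1990, 2030):
    print(year, is_increasing_in_age(year, range(40, 151)))

# Cohort view: substitute t = c + x, here for the 1930 birth cohort.
cohort = 1930
print([round(delta_male(cohort + x, x), 3) for x in (40, 50, 60)])
```

With these coefficients the derivative with respect to x stays positive well beyond age 150, so the negative quadratic term only bites at ages far outside the range of the data.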

Corresponding to (3.9), the fitted post-war surface for females 40 and over is, 
from Table 3.2, 

$$ \hat{\delta}_F(\hat{\beta};t,x) = 16.110 + 0.1460x - 0.01291t + 0.0004339x^2 - 0.0000488tx. \qquad (3.10) $$

The coefficients of t and tx are significantly different from zero and there is a 
steady downward trend with time which, because of the tx interaction, is also 
stronger with increasing age. 

It should be mentioned that the linear and cross product terms in year t in (3.9) 
are statistically not significantly different from zero and hence post-war male 
mortality is not changing significantly with time, apart from the smoothed 
ridge centred along t = 1970. This has relevance to forecasting since it raises 
the issue as to whether or not the terms in t and tx should be included if the 
surface is extrapolated to say t = 2030. Our response has been to include these 
terms. One reason is that extrapolation is frequently carried out by cohort and 
hence it is important to use the complete surface (3.8), which was based on 
calculations for t > 1890. Secondly there is the intuitive consideration that flat 
male mortality for the next 40 years from 1990 is simply not feasible. This is 
an instance where non-statistical issues must be taken into account, and the 
issue is taken up in the next chapter. 




4. Forecasting Mortality from Regression Models: the Case of the Netherlands 



Christopher HEATHCOTE and Tim HIGGINS 

Abstract 

This chapter continues the discussion of Chapter 3 in which regressions for 
male and female mortality for ages 40 and above in the Netherlands were 
estimated. A naive forecast of mortality can be obtained by extrapolation. 
However, a plausible forecast may require modification of the fitted model to 
obtain what is called a predictive model. A model of this sort is described and 
the resulting forecasts of period and cohort life expectancy and log(odds) are 
compared with those produced by extrapolations of the descriptive model. 
Period and cohort life expectancy for selected ages above 40 and for selected 
years to 2050 are given. A final section discusses the forecasts obtained from 
regression models using the perspective of the official national forecasts of 
Dutch mortality. 



4.1 Descriptive and Predictive Models 

This chapter should be read as a continuation of the previous one. Frequent 
reference will be made to earlier results and in an obvious abbreviation we will 
write for example Equation (3.9) for the formula numbered (3.9) in Chapter 3, 
and so on. 

In the absence of compelling views about future mortality the natural forecast 
to examine first is the naive one obtained by extrapolating the past. Following 
directly on from the previous chapter, and in the same notation, fitted 
regressions to mortality surfaces will be taken as apt descriptions of past 
mortality and extrapolated values of $\delta(\hat{\beta};t,x)$ can be found by substituting 
appropriate values for year t. It will emerge that in the case of the Netherlands 
the extrapolation of a descriptive model is not satisfactory and a predictive 
model must be developed. 

It is useful to distinguish between a fitted descriptive model, which is a 
formula as in Equations (3.9) and (3.10) in Chapter 3, and the associated 
predictive model from which numerical forecasts are obtained. The two 
models will be the same if a forecast is obtained by extrapolating the 
descriptive model. However, on occasion there may well be good reason to 
believe that changes in future circumstances will influence mortality and 
therefore a predictive model must be obtained by modifying the descriptive 
one. This can be done by appropriately altering the numerical values of the 
fitted regression coefficients in the descriptive mortality surface $\delta(\hat{\beta};t,x)$, 
estimated from historical data, with the magnitude of the alteration determined 
by views about the future. 

The fact that a descriptive model estimated from historical data must of 
necessity be modified for predictive purposes is implied by the lack of 
stationarity in demographic processes. It is often not clear what modifications 
are appropriate to produce a predictive model that is adequate to deal with 
future mortality in an environment changing in a largely unknown way. An 
implication is that intuitive notions about the future inevitably play an 
important and unavoidable role in forecasting mortality. Hence the reasoning 
used to obtain a predictive from a descriptive model must be made explicit, 
even if some suppositions are of dubious validity. We have attempted to do 
this in Section 4.3. 

Descriptive models of historical data are of interest in their own right, but if 
the primary purpose of fitting such models is to obtain a starting point for 
forecasting then the question arises as to how far back in time one should go. 
In other words, how many historical data should be used? This question is a 
luxury that can only be posed in the case of developed countries with extensive 
records. Even so, common practice in statistical agencies is to use data from 
only a few recent years when making a short-term forecast, with longer term 
forecasts requiring trends from a commensurate longer historical period. On 
the other hand, time series and regression models generally require extensive 
data for their satisfactory estimation, even if the aim is short-term prediction, 
but this presents no drawback since the methodology and computational 
procedures are well established. 

For many countries, including the Netherlands, the Second World War marks 
a watershed in the evolution of mortality (amongst other things). Wartime and 
post-war advances in medicine and public health suggest that developments 
after 1946 may be more informative about future mortality than pre-war data. 
Yet it may be a mistake to ignore earlier statistics since they could suggest at 
least certain qualitative relationships or comparisons that only become 
noticeable when long-run data are scrutinised. In fact it is almost invariably the 
case in statistical modelling that all available historical data are used in the 
estimation process. Typically all parts of the data set are given equal weight 
even though these parts may not all be of equal relevance to the forecasting 
enterprise, such reservations being brought to bear when deriving a predictive 
model. In the case of the Netherlands, as set out in the following sections of 
this chapter, the fitted mortality surfaces (3.9) and (3.10) are taken as the basic 
descriptive models of male and female mortality respectively for ages 
40 ≤ x ≤ 94 and years 1946 ≤ t ≤ 1990. Note that all data from 1890 on have 
been used to estimate these surfaces with the relative contribution of post-war 
effects appearing in the model as the factor of the indicator I( > 1945). 

In the next section it is argued that for mortality in the Netherlands 
extrapolation leads to forecasts that are not plausible. Section 4.3 develops a 
predictive model and presents our forecasts of life expectancies for ages 40 
and above to the year 2050. The main feature of this predictive model is that it 
preserves the roughly parallel historical decrease in male and female mortality 
that was perturbed by the excess male mortality centred about 1970. The final 
section presents a brief discussion of the forecasts made in the context of the 
view of Dutch national experts. 



4.2 Forecasting Dutch Mortality from a Descriptive Model 

Extrapolation of a descriptive model may well give satisfactory short-term 
forecasts but it is clear that extrapolation to a medium- or long-term horizon 
(i.e. 40 years) is fraught with hazard. Most of the numerical results below use 
historical data for 1890-1990 to make statements about Dutch mortality to the 
year 2030. So, some caution is necessary when dealing with these outcomes. 
For Dutch males aged 40-94, Figure 4.1 plots the empirical mortality surface 
for 1946 ≤ t ≤ 1990 and an extrapolation to t = 2030 of the fitted surface 
(3.9). The corresponding plot for females is similar in shape. Apart from 
dnorm, which has negligible effect after about 1995, the extrapolated plots are 
smooth, gently sloping quadratics and in themselves are of only mild interest. 
It is derived quantities such as life expectancies that are more informative. The 
formula used for life expectancy calculations is, in an obvious notation, 



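[Equation: summation formula for the life expectancy $\mathring{e}_x$; the right-hand side is not legible in the source.]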







where the summation is over all ages equal to and older than age x in an 
individual year for period life expectancies, and all ages equal to and older 
than age x along the diagonal of the Lexis plane for cohort life expectancies. 
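
The distinction between summing within a single year and summing along the diagonal of the Lexis plane can be illustrated as follows. This is only a sketch: the approximation e_x = 1/2 + (sum of k-year survival probabilities) is a standard textbook one and is an assumption here, not necessarily the formula used in the chapter; the cut-off `max_age` and the function names are also assumptions. The surface is the fitted female model (3.10), extrapolated beyond the ages 40-94 to which it was fitted.

```python
import math


def delta_female(t, x):
    """Fitted post-war female surface (3.10), rounded coefficients as printed."""
    return 16.110 + 0.1460 * x - 0.01291 * t + 0.0004339 * x ** 2 - 0.0000488 * t * x


def q(t, x):
    """Probability of death at age x in year t implied by the log(odds) surface."""
    return 1.0 / (1.0 + math.exp(-delta_female(t, x)))


def life_expectancy(t, x, cohort=False, max_age=120):
    """Assumed approximation: 1/2 plus the sum of k-year survival probabilities.

    Period: every q is read off in the same year t.
    Cohort: q is read along the Lexis diagonal, i.e. age x + k in year t + k.
    """
    survival = 1.0
    total = 0.0
    for k in range(max_age - x):
        year = t + k if cohort else t
        survival *= 1.0 - q(year, x + k)
        total += survival
    return 0.5 + total


for age in (60, 80):
    period = life_expectancy(1990, age)
    along_diagonal = life_expectancy(1990, age, cohort=True)
    print(f"age {age}: period {period:.2f}, cohort {along_diagonal:.2f}")
```

Because the female surface declines with t at every age, the cohort figure exceeds the period figure: a cohort benefits from the lower mortality of later years as it ages.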



Figure 4.1. Observed and extrapolated post-war log (odds) of Dutch males. 
Ages 40-94, years 1946-2030 

Figure 4.2 shows that the difference in mortality between the sexes reaches 
dramatic and perhaps implausible levels when extrapolated to the year 2030. 
Male cohort and period life expectancy at ages 60 and 80 remains roughly 
constant in time between 1946 and 2030, whereas the corresponding 
expectancies for females increase steadily. On the other hand, pre-war life 
expectancies tend to be parallel and close together, the exception being the 60- 
year-old female cohort life expectancy whose early date of increase reflects the 
beneficent post-war mortality regime enjoyed by females. 

Figure 4.3 uses plots of log(odds) = logit q(t,x) for selected t and x to 
illustrate the same point. The slow rate of decrease (if any) in the extrapolated 
male curves contrasts sharply with the noticeable steady decrease in the female 
case. Reference back to Figure 3.5 in Chapter 3 and comparison with Figure 
4.3 illustrates once more the different pre- and post-war mortality regimes 
holding for the sexes. The source of the difference is the hump of excess male 
mortality roughly centred around 1970. It can be seen clearly in the male 
curve for age 60 in Figure 4.3. The effect of this hump is to destroy the pre- 
war parallelism of male and female logit q(t,x) and to induce increasingly 
discordant projections. 



Figure 4.2. Period and cohort life expectancy from fitted and extrapolated mortality 
surfaces. Dutch males and females at ages 60 and 80 

Figure 4.3. Observed, fitted and extrapolated log (odds) based on descriptive models. 
Dutch males and females at ages 60 and 80 




Following the peak of male mortality around 1970 there appears to be a steady 
fall which for some ages flattens in the 1980s, but seems to continue for 
others. Does this indicate a realignment of male and female experience? More 
generally, if the hump is a one-off phenomenon then perhaps it should not play 
such an influential role and a predictive model should take this into account. 

A similar picture emerges from the numerical results presented in Table 4.1 
and Table 4.2. Standard errors for the period and cohort life expectancies are 
shown in brackets under the appropriate entries in the tables. The formula used 
for the standard errors is (4.10) on page 163 of Chiang (1984). These two 
tables and Figures 4.2 and 4.3 pose in an acute way the dilemma faced when 
basing a forecast on extrapolation — does one really accept the apparent 
increasing divergence between male and female mortality? 

Difficulties with extrapolation due to the hump of male post-war excess 
mortality have already been discussed. It can be argued that the slope of the 
extrapolated surface for females may also be inappropriate. The linear fit to 

Table 4.1. Observed (1990) and predicted (t ≥ 1991) period life expectancies for Dutch 
males and females. Extrapolations of the descriptive models (3.9) and (3.10). Standard 
errors shown in brackets 

Age       1990      2000      2010      2020      2030

Males
40       35.42     35.61     35.75     35.89     36.03
        (0.06)    (0.06)    (0.06)    (0.06)    (0.06)
50       26.26     26.43     26.56     26.69     26.82
        (0.06)    (0.06)    (0.06)    (0.06)    (0.06)
60       18.01     18.15     18.27     18.38     18.50
        (0.05)    (0.05)    (0.05)    (0.05)    (0.05)
70       11.22     11.32     11.41     11.50     11.59
        (0.04)    (0.04)    (0.04)    (0.04)    (0.04)
80        6.33      6.39      6.43      6.51      6.57
        (0.04)    (0.04)    (0.04)    (0.04)    (0.04)
90        3.33      3.36      3.39      3.43      3.46
        (0.05)    (0.05)    (0.05)    (0.05)    (0.05)

Females
40       41.79     43.21     44.63     46.05     47.48
        (0.07)    (0.07)    (0.07)    (0.07)    (0.07)
50       32.33     33.69     35.06     36.43     37.82
        (0.06)    (0.07)    (0.07)    (0.07)    (0.07)
60       23.34     24.59     25.87     27.16     28.46
        (0.06)    (0.06)    (0.06)    (0.06)    (0.06)
70       15.23     16.30     17.41     18.55     19.72
        (0.05)    (0.05)    (0.05)    (0.05)    (0.05)
80        8.61      9.42     10.27     11.16     12.09
        (0.04)    (0.04)    (0.04)    (0.04)    (0.04)
90        4.09      4.57      5.09      5.66      6.28
        (0.04)    (0.04)    (0.04)    (0.04)    (0.04)

Female minus male difference
40        6.37      7.60      8.88     10.16     11.45
50        6.07      7.26      8.50      9.74     11.00
60        5.33      6.44      7.60      8.78      9.96
70        4.01      4.98      6.00      7.05      8.13
80        2.28      3.03      3.78      4.65      5.52
90        0.76      1.21      1.70      2.23      2.82

the female log(odds) in Figure 4.3 is based on an ‘average’ slope for the 
log(odds) between 1946 and 1990. Extrapolations of the descriptive model to 
2030 are formed using this slope. Figure 4.3 suggests that the decrease in 
female mortality since 1970 or 1980 has slowed for some ages. It is plausible 
that this trend continues and that future rates will not decrease as quickly as 
suggested by extrapolations of the descriptive model. A predictive model 
needs to incorporate considerations such as these. 



4.3 Forecasting Dutch Mortality from a Predictive Model 

It is apparent that there should be serious concern about extrapolating the fitted 
model beyond 1990 to obtain a forecast since male mortality seems to decrease 
too slowly and female mortality perhaps too quickly. If this reservation is 
accepted, alternatives must be sought. One is to specify an outcome at some 
future time, for example, specified life expectancies or probabilities of death at 
say t=2050, and then obtain a predictive model from a fit constrained by this 
outcome. A second procedure is to modify a fit to historical data by 
eliminating (or ameliorating) the effect of specific causes of death, such as 
coronary heart disease 1950-1990, and/or extrapolate a fit modified by an 
“improvement factor”. More generally, a predictive model can be obtained by 
changing values of coefficients in the descriptive model, provided the reasons 
for doing so are transparent and persuasive. 

At risk of labouring the point, suppose the descriptive model of a mortality 
surface is 

$$ \delta(t,x) = d_0 + d_1 t $$

where the coefficients $d_0$, $d_1$ depend on age x. For fixed x, 
$d_1 = \delta(t+1,x) - \delta(t,x)$ is the annual change in the description of historical 
mortality. A predictive model is obtained by altering the numerical value of 
$d_1$ (and perhaps also $d_0$) to accord with beliefs about the future. To examine 
the question from a different angle, suppose forecasts are to be obtained under 
the assumption of a proportionate decrease in the time- and age-specific 
probability of death. Thus, suppose that for fixed $t_0$, $x_0$ and $s$, 

$$ q(t_0+s, x_0) = \gamma\, q(t_0, x_0). $$

That is, for given age $x_0$, it is assumed that the probability of death in year 
$t_0 + s$ is the fraction $\gamma$ of what it was in year $t_0$. Then the odds of death 
become 

$$ \frac{q(t_0+s,x_0)}{1-q(t_0+s,x_0)} = \frac{\gamma\,q(t_0,x_0)}{1-\gamma\,q(t_0,x_0)} = \frac{q(t_0,x_0)}{1-q(t_0,x_0)} \cdot \frac{\gamma\,(1-q(t_0,x_0))}{1-\gamma\,q(t_0,x_0)}. $$

In terms of the mortality surface $\delta(t,x) = \operatorname{logit} q(t,x) = \log\frac{q(t,x)}{1-q(t,x)}$, 

$$ \delta(t_0+s,x_0) = \delta(t_0,x_0) + \log\frac{\gamma\,(1-q(t_0,x_0))}{1-\gamma\,q(t_0,x_0)}. $$

The assumption about the probabilities of death translates into this constraint 
on the linear regression function $\delta(t,x)$ and the regression coefficients must 
be modified accordingly. 
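
The constraint can be turned into a concrete adjustment of the time slope. The sketch below is illustrative only: the starting probability, the fraction and the horizon are hypothetical values chosen for the example, and the helper names are assumptions rather than anything defined in the chapter.

```python
import math


def logit(p):
    """Log(odds) of a probability p."""
    return math.log(p / (1.0 - p))


def logit_shift(q0, gamma):
    """Change in delta implied by q(t0 + s, x0) = gamma * q(t0, x0)."""
    return math.log(gamma * (1.0 - q0) / (1.0 - gamma * q0))


def implied_annual_slope(q0, gamma, s):
    """Slope d_1 of a surface linear in t that meets the constraint after s years."""
    return logit_shift(q0, gamma) / s


# Hypothetical illustration: a 20 per cent fall in q over 40 years from q0 = 0.02.
q0, gamma, s = 0.02, 0.8, 40
shift = logit_shift(q0, gamma)
print(f"shift in log(odds): {shift:.5f}")
print(f"implied annual slope d_1: {implied_annual_slope(q0, gamma, s):.6f}")
# The shifted log(odds) reproduce the target probability gamma * q0.
print(abs(logit(gamma * q0) - (logit(q0) + shift)) < 1e-12)
```

For small q the shift is close to log(gamma), so a proportionate fall in the probability of death corresponds to an almost constant downward step in the log(odds).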

Following this line of thought, a predictive model based on what seems to be a 
set of reasonable propositions can be proposed. Suppose the post-war coronary 
hump modelled by dnorm is an aberration which disturbed the generally steady 
decrease in male mortality. In particular it broke the nexus with female 
mortality as illustrated in Figure 3.4 in Chapter 3. The pre-war downward 
trend of male and female mortality was almost the same so if dnorm modelled 
a singular event, affecting males only, then it is plausible that the trends will 
once more tend towards parallelism. That is to say, the descriptive model (3.9) 
for males should be modified for predictive purposes by decreasing the slope 
with respect to time t. On the other hand, a slight flattening of female mortality 
relative to the long-term trend, commencing about 1970 (see Figure 4.3) 
suggests that the steady downward movement of female mortality may be 
becoming attenuated. From Equations (3.9) and (3.10) the coefficients of time 
t in the post-war male and female models are respectively -0.001652 and 
-0.012908, differing by a factor of nearly eight. The average is -0.00728 and 
the discussion above leads to the following proposal: 




Table 4.2. Cohort life expectancies of Dutch males and females predicted from the descriptive models (3.9) and (3.10). Standard errors are shown in brackets 

1940 1950 1960 1970 1980 1990 2000 2010 2020 2030 

8.23 9.11 10.01 10.95 11.94 
(0.05) (0.05) (0.05) (0.05) (0.05) 

Predictive models of mortality for t ≥ 1991 are as in Equations (3.9) and 
(3.10) modified by taking the coefficient of t as -0.00728. When t ≤ 1990, 
Equations (3.9) and (3.10) remain unchanged. 

It is not proposed to alter the time-age interaction coefficient, so the rate of 
decrease of mortality is not exactly the same for the two sexes. For 
completeness, and slightly changing notation, we write out the male and 
female predictive models for the mortality surface $\delta(\beta;t,x) = \operatorname{logit} q(t,x)$ to 
be used for forecasts beyond 1990. Continuity is maintained by adjusting the 
intercept to give equality of the fitted and predictive models at t = 1990. For 
males and females respectively and t ≥ 1991, x ≥ 40, 

$$ \delta_{MP}(t,x) = 3.3707 + 0.1192x - 0.00728t - 0.0001577x^2 + 0.0000022tx + (7.801 - 0.001078x^2)\,\mathrm{dnorm} \qquad (4.1) $$

$$ \delta_{FP}(t,x) = 4.8963 + 0.1460x - 0.00728t + 0.0004339x^2 - 0.0000488tx. \qquad (4.2) $$

Thus the male and female time- and age-specific probabilities of death 
proposed to construct forecasts are: 

$$ q_{MP}(t,x) = \frac{1}{1+\exp\{-\delta_{MP}(t,x)\}}, \qquad q_{FP}(t,x) = \frac{1}{1+\exp\{-\delta_{FP}(t,x)\}}, $$

and these are the quantities that form the basis for the forecasts presented in 
this section. 
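
The construction of the predictive surfaces can be retraced in code. This is a sketch under stated assumptions: it uses the rounded coefficients printed in (3.9), (4.1) and (4.2), treats dnorm as the normal density with mean 1970 and standard deviation 8, and the function and constant names are illustrative.

```python
import math

T_JOIN = 1990
COMMON_SLOPE = -0.00728  # coefficient of t adopted for both sexes after 1990


def dnorm(t, mean=1970.0, sd=8.0):
    """Normal density in calendar year t."""
    return math.exp(-0.5 * ((t - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def continuity_intercept(old_intercept, old_slope, new_slope=COMMON_SLOPE, t_join=T_JOIN):
    """Intercept that keeps a surface unchanged at t_join when only the t slope changes."""
    return old_intercept + (old_slope - new_slope) * t_join


def delta_mp(t, x):
    """Male predictive model (4.1), coefficients as printed."""
    return (3.3707 + 0.1192 * x - 0.00728 * t - 0.0001577 * x ** 2
            + 0.0000022 * t * x + (7.801 - 0.001078 * x ** 2) * dnorm(t))


def delta_fp(t, x):
    """Female predictive model (4.2), coefficients as printed."""
    return 4.8963 + 0.1460 * x - 0.00728 * t + 0.0004339 * x ** 2 - 0.0000488 * t * x


def q_from_delta(delta):
    """Probability of death from the log(odds)."""
    return 1.0 / (1.0 + math.exp(-delta))


# The common slope is the average of the two post-war coefficients of t.
print(round((-0.001652 + -0.012908) / 2.0, 5))
# Male intercept implied by continuity at 1990, from the rounded values in (3.9).
print(round(continuity_intercept(-7.833, -0.00165), 4))
for year in (2000, 2030, 2050):
    print(year, round(q_from_delta(delta_mp(year, 60)), 5), round(q_from_delta(delta_fp(year, 60)), 5))
```

For males the continuity calculation reproduces the printed intercept 3.3707. Applied to the rounded coefficients of (3.10), the same calculation gives about 4.906 for females, slightly different from the printed 4.8963; the sketch keeps the printed value.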

It should be made clear that arguments can be made for other adjustments that 
contribute to an alignment of the rates of change of mortality. For example 
one could envisage slopes similar to those holding before the war, or some 
variant on that theme. However, the virtual disappearance of epidemics (apart 
from the coronary one for males) and generally altered health conditions by 
the end of the twentieth century suggest that the Netherlands will experience a 
continuing decline in mortality, perhaps at a slower rate, and that it is the 
approximate parallelism of the slopes of $\delta_{MP}(t,x)$ and $\delta_{FP}(t,x)$ at a moderate 
value that is important. Under these circumstances we believe that the 
predictive model postulated above is plausible. 

Figure 4.4 plots logit q(t,60) and logit q(t,80) using observed data to 1990 
and the predictive model thereafter. In contrast to the plots of Figure 4.3, male 
and female predictions diverge only slightly. Figures 4.3 and 4.4 provide a 
graphical summary of our views on forecasting Dutch mortality; the widening 
gaps in the extrapolations of Figure 4.3 after 1990 are unlikely, whereas the 
estimates of logit q(t,x) depicted in Figure 4.4 form a reasonable basis for 
forecasting life expectancies and other measures of interest. 

Table 4.3 displays male and female period life expectancies using (4.1) and 
(4.2). They should be compared with the life expectancies in Table 4.1 
obtained by extrapolating the fitted model. Observe the increase in male life 
expectancy and the simultaneous decrease in that for females. In particular the 
maximum female-male difference of 11.45 years in 2030 of Table 4.1 is 
replaced by the more plausible value of 7.41 in Table 4.3. 



Figure 4.4. Observed and predicted log (odds) based on predictive models. 
Dutch males and females at ages 60 and 80 

Table 4.3. Observed (1990) and predicted (t ≥ 1991) period life expectancies for Dutch 
males and females. Extrapolation of the predictive models (4.1) and (4.2). Standard 
errors similar to those in Table 4.1. 

Age       1990      2000      2010      2020      2030      2040      2050

Males (e_MP)
40       35.42     36.14     36.81     37.47     38.14     38.80     39.47
50       26.26     26.93     27.56     28.19     28.82     29.45     30.09
60       18.01     18.59     19.15     19.70     20.27     20.83     21.41
70       11.22     11.67     12.11     12.56     13.02     13.48     13.95
80        6.33      6.62      6.93      7.24      7.56      7.89      8.23
90        3.33      3.50      3.67      3.85      4.05      4.25      4.46

Females (e_FP)
40       41.79     42.73     43.66     44.60     45.55     46.51     47.46
50       32.33     33.23     34.13     35.04     35.96     36.88     37.81
60       23.34     24.17     25.01     25.86     26.72     27.59     28.47
70       15.23     15.94     16.67     17.42     18.18     18.95     19.74
80        8.61      9.15      9.70     10.28     10.88     11.49     12.13
90        4.09      4.41      4.75      5.11      5.50      5.90      6.32

Difference (e_FP - e_MP)
40        6.37      6.59      6.85      7.13      7.41      7.71      7.98
50        6.07      6.30      6.57      6.85      7.14      7.33      7.72
60        5.33      5.58      5.86      6.16      6.45      6.76      7.06
70        4.01      4.27      4.56      4.86      5.16      5.48      5.79
80        2.28      2.53      2.77      3.04      3.32      3.60      3.90
90        0.76      0.91      1.08      1.26      1.45      1.65      1.86

Forecasts of cohort life expectancies are in Table 4.4 and they should be 
compared with the corresponding (time, age) entries of Table 4.2 based on 
extrapolation, and also to the period expectancies (see also Figure 4.6). 
Forecasts of cohort expectancies for certain years and ages (e.g. age 50 in 
2000) depend to an increasing extent on values of the predictive models in the 
distant future, perhaps up to the year 2070, and should therefore be treated 
with caution. 




Table 4.4. Cohort life expectancies for Dutch males and females obtained from extrapolation of the predictive models (4.1) and (4.2). Standard errors similar to those in Table 4.2. 

1940 1950 1960 1970 1980 1990 2000 2010 2020 2030 

4.35 5.31 5.74 6.11 6.43 
3.38 4.02 4.47 4.79 5.12 
1.99 2.49 2.75 3.04 3.33 
0.81 0.98 1.15 1.34 1.55 

Figure 4.6 and Figure 4.7 illustrate differences of selected quantities calculated 
by the predictive and the extrapolated model. The effect of the coronary hump 
(dnorm) is clear; as shown in Figure 4.6, it caused a sharp divergence in male 
and female life expectancies in the post-war years with parallelism recovered 
by the predictive but not the descriptive model. The difference between models 
is not so dramatic for conditional survival probabilities, as shown in Figure 
4.7. Life expectancies estimate the areas under such curves so discrepancies 
accumulate. If plausible forecasted life expectancies, and a plausible forecast 
of differences between the sexes are taken as criteria, then we believe that the 
predictive model proposed here is acceptable. It uses historical information in 
a technically efficient way and its regression coefficients are obtained by 
clearly stated arguments. 



4.4 Discussion 

One of the advantages of basing forecasts on linear regression models with 
normally distributed disturbances is that formulae for standard errors and 



Figure 4.5. Period and cohort life expectancy from fitted and predicted mortality 
surfaces. Dutch males and females at ages 60 and 80 









Figure 4.6. Period life expectancy calculated from descriptive and predicted mortality 
surfaces. Dutch males and females at ages 40, 60 and 80 







Figure 4.7. Probability of survival to age x given age 40 in 1970 (1930 birth cohort) 
based on descriptive and predictive models. Dutch males and females 

confidence intervals are known and often incorporated into computational 
packages. Thus if extrapolation of a fitted regression is used to forecast the 
future, upper and lower confidence intervals of the extrapolated regression can 
be used to obtain upper and lower forecasts of quantities such as life expec- 
tancies calculated from the mortality surface. Even so, we believe that 
extrapolation of the descriptive models (3.9) and (3.10) for males and females 
together with upper and lower 95 per cent confidence intervals lead to 
forecasts that are not plausible. Further, we are inclined to reject such a 
construction for a predictive model since it is firmly rooted in the past and 
leaves no room for opinions about the future. Recall that demographic 
processes are inherently non-stationary and the central problem of forecasting, 
according to our methodology, is to bring judgements about the future to bear 
to obtain a predictive model by altering the parameters in what is believed to 
be an apt description of the past. 

A reason for the non-stationarity of mortality data is that the causes of death 
that are significant are not constant over time. For example, particularly in 
developed countries, contagious diseases have lost much of their importance in
the post-war years and cancers and heart disease have come to play a dominant
role. The surge in post-war male mortality due largely to coronary heart
disease is included in our models by the regressor dnorm and, for example,
Figures 3.4 and 3.5, Chapter 3, illustrate its rise to about 1970 and fall
thereafter. Clearly, extrapolations using data up to 1970 would indicate higher 
future male mortality than obtained from the complete data set. Further, 
plausible judgements at that time could well have reinforced the view of non- 
decreasing male mortality. We can now be wise after the event, but the 
appearance of a rising post-war trend would have posed considerable 
difficulties for regression forecasters around 1970. A suggestion we have for 
investigating non-stationarity and a way of studying the evolution of groups of 
specific diseases is to model partial mortality surfaces; that is, mortality 
surfaces particular to specific causes of death. In this way it may be possible to 
form a better judgement of the future contribution of such causes to overall 
mortality. This is suggested as a possible further useful application of
regression modelling, despite the drawback that it is only aggregate data that 
are available over long periods of time. 

In a later chapter of this volume, and from a different perspective, Van Hoorn
and De Beer (2000) reach the conclusion that the difference in life expectancy
between the sexes is decreasing. They base their observation on current
empirical mortality rates, a point supported by the observed log(odds) in
Figure 4.3, and on the assumption of changing epidemiological patterns 
leading to a period life expectancy at birth of 83 for females and 80 for males 
in the year 2050. They point out in their contribution that there is evidence that 
some two-thirds of the current difference in mortality is due to smoking. 
Assuming that smoking behaviour becomes similar, the current difference in 
life expectancy of about 6.5 years should reduce to less than three years. 
Hence the life expectancies of 83 and 80 in 2050 are arrived at through the 
careful weighing of evidence from several sources. That is, the forecasting 
method used is essentially an interpolation between current trends and a belief 
about future epidemiological developments. Their prediction of a reduction in 
the sex differential suggests that life expectancy will increase faster for males 
than for females. For ages over 40 our forecast is not in agreement since, for
age 60 at least, Figure 4.4 shows that female life expectancy still increases
faster than that for males, although the difference between the sexes is much 
less than in the extrapolations of the descriptive model shown in Figure 4.2. 
Period life expectancies in 2050 for 40-year-olds calculated using the 
predictive models (4.1) and (4.2) are 39.47 for males and 47.46 for females, a 
difference of about eight years, as opposed to the three years assumed by Van 
Hoorn and De Beer. It is instructive to compare the methodology of Van 
Hoorn and De Beer with ours since the same problem is being approached 
from two distinct disciplinary traditions. Despite the different approaches, and 
adjusting for different definitions, the resulting life expectancies are quite 
similar for males. Life expectancies shown in Table 4.3 were produced up to 
year 2050 for the purpose of comparison with those produced by Van Hoorn 
and De Beer. Relative to their medium assumption of a life expectancy of 83
in the year 2050, our predictive model (and certainly our descriptive model)
for females is overly optimistic when projected so far ahead. It is more in
accord with their maximum model of an 85-year life expectancy. Implicit in
the medium Van Hoorn-De Beer assumption is the proposition that there must
be a flattening of the downward 1890-1990 trend in the mortality of older
women. This is consistent with the notion that improvement in mortality
cannot be sustained indefinitely, at least not at the rate achieved during the past
century, possibly because there may be an upper limit to human life span that
will be approached by the middle of the 21st century. Judgemental factors
become all important when forecasting so far ahead and we do not wish to 
dispute the assumptions of Van Hoorn and De Beer. However, we stress that 
predicting from long-term trends (since 1890) in regression models, based on 
the plausible assumption of similar slopes for males and females, leads to the
results of Tables 4.3 and 4.4, which are not concordant with theirs in certain
important respects. 
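
The arithmetic behind the two sets of figures quoted in this paragraph can be laid out explicitly. The sketch below uses only numbers stated in the text; the variable names are ours, and treating the smoking-related share as removed in full is a simplifying assumption made for illustration.

```python
# Figures quoted in the text (years of life expectancy).
current_gap = 6.5            # current female-male difference in life expectancy
smoking_share = 2.0 / 3.0    # share of the difference attributed to smoking

# Assumption for illustration: the smoking-related share disappears entirely.
residual_gap = current_gap * (1.0 - smoking_share)

# Period life expectancy at age 40 in 2050 from predictive models (4.1) and (4.2).
e40_males, e40_females = 39.47, 47.46
regression_gap = e40_females - e40_males

print(f"residual gap after convergence of smoking behaviour: {residual_gap:.2f} years")
print(f"gap implied by the regression forecast: {regression_gap:.2f} years")
```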

Statistical forecasting is typically short-term with the model and forecast 
updated as new information becomes available. The long time horizons of 
interest to demographers diminish the usefulness of such methodologies in
forecasting mortality and make the use of opinions and judgements about the
future inevitable. Generally, the role of judgemental factors in forecasting
poses difficult conceptual and practical questions. Alho (1992) proposes to
formalise the contribution of judgement by using what is called mixed
estimation and forecasting. Certain future values, specified by expert 
judgement, of a measure of mortality, are adjoined to historical observations to 
obtain an augmented data set which is used to estimate a statistical model. 
Extrapolation (and interpolation) of the estimated model provide forecasts of 
mortality, with the relative importance of judgemental factors evaluated 
through a partitioning of the variance. This is a programme that could easily 
be implemented in the regression modelling of augmented mortality surfaces, 
since all that is needed is the specification of values for a set of future time- 
and age-specific probabilities of death. More generally, it is feasible to 
estimate a mortality surface subject to a variety of constraints, such as a 
specified life expectancy at a given date in the future. Again, a difficulty is in
achieving agreement amongst experts about future values, but it is an approach 
that calls for further exploration and development. 
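
The idea of adjoining judgemental values to the historical record can be sketched in a few lines. The example below is a deliberately simplified stand-in for mixed estimation, not Alho's procedure: it fits a weighted straight line to invented historical log death rates plus one expert-specified future value, and the weight given to the expert value (a hypothetical tuning constant) controls how strongly judgement pulls the forecast away from the pure trend.

```python
def weighted_line_fit(x, y, w):
    """Weighted least-squares fit of y = a + b * x; returns (a, b)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Invented historical observations: log death rate falling by 0.01 per year.
hist_years = [1950, 1960, 1970, 1980, 1990]
hist_log_rate = [-4.00, -4.10, -4.20, -4.30, -4.40]

# Hypothetical expert judgement: a slower decline than the trend implies.
expert_years = [2050]
expert_log_rate = [-4.70]


def mixed_forecast(year, expert_weight):
    """Forecast from the augmented data set; expert_weight = 0 gives the pure trend."""
    x = hist_years + expert_years
    y = hist_log_rate + expert_log_rate
    w = [1.0] * len(hist_years) + [expert_weight] * len(expert_years)
    a, b = weighted_line_fit(x, y, w)
    return a + b * year


for weight in (0.0, 1.0, 10.0, 1e6):
    print(weight, mixed_forecast(2050, weight))
```

With zero weight the forecast is the extrapolated trend; as the weight grows the forecast for 2050 approaches the expert value, which makes the relative influence of judgement explicit.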

From a different point of view, Wright et al. (1996) and Bunn and Wright 
(1991) note that there is an increasing interest in the interaction of judgemental
and statistical models. On pages 507-8 of their contribution Bunn and Wright 
give examples, such as weather forecasting under certain circumstances, in 
which what they call judgemental probability forecasting outperforms 
extrapolation of statistical models. However, these examples are of limited 
relevance to mortality forecasting because of the different time horizons 
involved. Causal information can add considerable plausibility to a predictive 
model, though the benefits to the model are contingent on the reliability of the 
causal information. Causal factors that affect mortality include changing 
environmental conditions and social tendencies, as well as medical and 
technological innovation. Changes in these factors will, to a greater or lesser 
extent, impact on the future shape of the mortality surface. Incorporating such 
factors into a predictive model, however, can compound the model’s 
uncertainty if the judgements are poorly founded. In an examination of
judgemental forecasting, Lim and O’Connor (1996) note that, ‘While
adjustment of forecasts using causal information of low reliability did not lead
to significant improvement, adjustment using highly reliable causal
information produced forecasts more accurate than the best statistical models.’
This raises the question, how does one assess the relevance and reliability of
epidemiological and other contextual information with respect to the future?
Ahlburg and Land (1992), on page 292 of their introduction to a special issue
on population forecasting in the International Journal of Forecasting, observe
that errors may be self-reinforcing if emanating from a single source. They
refer to the work of Armstrong (1985) for evidence that averaging the opinions
of several experts improves the accuracy of judgemental forecasts. However,
there is no clear answer and, of course, difficulties are compounded by the fact
that there is such a long wait before most demographic forecasts can be
verified.



5. Gompertz in context: the Gompertz and related distributions

Abstract

A model often used in mortality analysis and forecasting is the Gompertz 
model. The model is attractive because the mathematical specification is 
consistent with theories of ageing. Other models that are used less often share 
some of the useful features of the Gompertz model. Furthermore, the 
Gompertz can be approached as a member of families of models that have
desirable properties. The study of the Gompertz as a member of families of 
models opens new perspectives for mortality analysis and forecasting because 
of the added flexibility in representing mortality profiles. This chapter 
identifies two families of models to which the Gompertz belongs. The first is 
the generalised logistic distribution, which includes the standard logistic 
distribution, the Gompertz and some other distributions that do not seem to be 
directly relevant to mortality analysis. The second family consists of the 
different types of extreme value distributions, the Gompertz being a truncated
Type I extreme value distribution. The logistic and extreme value distributions 
have been studied extensively in the literature, in particular in survival analysis 
or duration analysis, and in reliability engineering. Over the years, useful 
properties of these distributions have been discovered that may be used 
effectively in mortality modelling beyond the Gompertz. The future of the 
Gompertz as a unique model seems limited. It is expected that the Gompertz 
will be succeeded by a powerful class of models with great prospects for 
mortality analysis and forecasting; namely, the extreme value distributions.



5.1 | Introduction

Two approaches to mortality forecasting may be distinguished. The first uses 
trend models that extrapolate observed regularities in mortality trends. The 
second relies on process models that capture the main features of causal, 
mainly biological, processes affecting susceptibility to death. A major 
challenge in the application of trend models is the identification of stable or
stationary patterns that may be considered to continue to hold into the future. 
The identification of patterns may call for disaggregation of the population on 
the basis of risk factors, but the disaggregation is subordinate to the modelling 
of the trend. Process models rely on epidemiological and biomedical theories 
of ageing for the description (modelling) of the causal mechanism. Tabeau et 
al. (1998) reviewed several mortality projection models and gave examples of 
trend models and process models. Time series methods, the age-period-cohort 
method, the Lee and Carter model, and the combined graduation-extrapolation 
approach represent the trend methods. Process models include models that 
view ageing of an organism as an accumulation of defects or failures in 
preclinical form, reducing the ability to maintain homeostasis and hence the 
resistance to death (for an overview of theories of ageing, see e.g. Gavrilov 
and Gavrilova, 1986; Manton, 1993b; Manton et al., 1997; see also Yashin,
in this volume). This view is operationalised in a number of models, most
importantly the Gompertz model, the Weibull model (e.g. Manton, 1993a) and
the avalanche-like destruction model by Gavrilov and Gavrilova (1986). The
Gompertz is without doubt the best known model, used in a range of
disciplines from botany to sociology:

$$\mu(x) = \theta \exp[\beta x], \qquad x > 0$$

where $x$ is age, $\mu(x)$ the force of mortality at $x$, and $\theta$ and $\beta$ are the model
parameters. The model shows that the force of mortality increases
exponentially with age at a rate $\beta$. The aim of this chapter is to present the
Gompertz model as a probability model of lifetime data, to review the 
properties of the model, and to compare the Gompertz model to other 
probability models that have useful properties for mortality analysis and 
forecasting. In studies of mortality, the relationship between the Gompertz and 
the Weibull is documented. This chapter demonstrates the relationship between 
the Gompertz and the two distributions that are common in survival analysis: 
the logistic distribution and the extreme value distribution. It is shown that the 
Gompertz distribution is a special case of a generalised logistic distribution.
The classical logistic density is symmetric whereas the density of the
Gompertz distribution is skewed. Most importantly, however, the Gompertz
model is a truncated Type I extreme value distribution (cf. also Johnson et al.,
1995).

Gompertz (1825, 1872) was the first to develop a model of mortality that
explicitly relies on an underlying causal mechanism. The mechanism is
described in general terms only. In fitting a model to the age profile of
mortality, Gompertz assumed that the force of mortality (instantaneous death
rate) increased with age because the 'resistance to death' declines with age.
The change in resistance to death cannot be measured directly; it is a latent
variable. Only the effect (death) can be measured. Change in resistance with
age is an unobserved or latent causal process. Gompertz assumed that the
resistance to death declines exponentially with age (the 'power to oppose
destruction' and hence mortality changes exponentially as age increases). Each
year, a person loses a constant fraction of his/her remaining 'vital force' or
vitality. Gompertz summarised the effect of unobserved processes as a
parametric form of time (age) dependence. Time dependence often reflects
unobserved causal processes.¹ Because of the link between observed time
dependence and unobserved heterogeneity, part of the observed time
dependence may be a consequence of the fact that there is a mixture of lifetime
distributions generating the observed deaths. For instance, the Gompertz
distribution can be obtained as a mixture of exponential distributions (Cox and
Oakes, 1984 and Courgeau and Lelievre, 1992).

¹ In the context of dynamic models of sociological processes, Tuma and Hannan
asserted that 'building models with time dependence should not be an end in itself.
Since sociologists seek to explain the processes that govern social structures, they
want ultimately to measure the causal factors responsible for time variation in rates.
Once causal factors have been measured, they should be incorporated explicitly into
models.' (Tuma and Hannan, 1984). A similar view is held by Blossfeld and Rohwer
who state that 'parametric models of time-dependence should only be applied with
extreme caution' (Blossfeld and Rohwer, 1995).

The Gompertz model dominated mortality forecasting for more than 100 years
(for a recent historical account, see Olshansky and Carnes, 1997). Several
'laws of mortality', including the models by Thiele, Perks and Heligman-Pollard,
contain the Gompertz model as one of their components (for a
discussion, see e.g. Hartmann, 1983; Rogers and Gard, 1991; Horiuchi and
Coale, 1990; Gage and Mode, 1993 and Tabeau et al., 1998). In his work on
modelling ageing and mortality at the individual level, Manton (various
publications, e.g. Manton, 1993a and Manton et al., 1997) introduces the 
Gompertz model to add an effect of 'intrinsic ageing'. Mortality is assumed to 
be related to unobserved age-related factors (the Gompertz component) and to 
the (quadratic) effect of factors that can be measured such as age, manifest risk 
factors and 'physiological history' of the individual (the non-Gompertz 
component). An interesting feature of the Manton model is that, as more 
knowledge becomes available on diseases and risk factors, the contribution of 
the Gompertz component is reduced (Manton et al., 1997). The Gompertz
model represents a lack of knowledge on the biological mechanism of ageing 
and as more is known about the causal mechanism, the model becomes less 
important in predicting mortality rates. The Gompertz model is applied not 
only in mortality studies but also in duration analysis in general. For instance, 
the model is discussed by Blossfeld and Rohwer (1995) and its parameters are 
related to covariates. The popularity of the Gompertz model in research is not 
reflected by its use in practice. Statistical offices use the Gompertz model 
much less frequently. A survey of methods of mortality projections in 
industrialised countries found that only one country forecasts mortality based 
on a 'law of mortality'. Austria applies the Heligman-Pollard model (Gomez 
de Leon and Texmon, 1992 and Van Poppel and De Beer, 1996). 

In this chapter, I discuss the Gompertz model in relation to other models that 
are used in lifetime-data analysis, such as the logistic function, the Weibull
distribution, the extreme value distribution, and the double exponential
distribution. In Section 5.2 the Gompertz model is described as proposed by 
Gompertz himself. Basic properties and specifications of the Gompertz 
distribution are presented. Section 5.3 views the Gompertz from the 
perspective of survival analysis or duration (lifetime) data analysis. The model 
is related to three other models: the Poisson model, the logistic distribution, 
and the Weibull distribution. In Section 5.4, the Gompertz is presented as a 
member of the family of extreme value distributions. The Gompertz can be 
written as a truncated extreme value distribution. This relationship sheds new
light on the causal mechanism that may be hypothesised to underlie the 
Gompertz model. In addition, some properties of the extreme value 
distributions presented in this section are seemingly unrelated to the Gompertz, 
but have a potential for becoming important characteristics. Section 5.5 
concludes the chapter. 

Note that the review of the Gompertz function presented here is neither 
exhaustive nor comprehensive. Additional properties of the Gompertz
distribution and its estimation from data using the maximum likelihood method
are discussed by Pollard and Valkovics (1992).



5.2 | The Basic Gompertz Model

In developing his model of adult mortality, Gompertz (1825) attributed death
of adults to two factors: deterioration of the power to withstand destruction,
i.e. the resistance to death, and chance. Gompertz invoked chance to explain
why members of a presumed homogeneous cohort die at different times
(Gompertz, 1928; see Olshansky and Carnes, 1997). This section reviews the
basic Gompertz model. The functions considered are the hazard function, 
which is the age profile of death rates, the survival function, and the 
probability density function. 

Let X be a random variable denoting age, or time elapsed since an event-origin.
A realisation of X is denoted by x. At each age, an individual has a
power to withstand destruction, i.e. a resistance to death. The resistance to
death at age x is denoted by r(x). The notion of resistance to death is
equivalent to the ability to maintain homeostasis and is consistent with
biological theories of ageing that view ageing as a failure process, more
particularly as an accumulation of failures in preclinical form. When the
number of failures or errors in the organism exceeds some threshold, the 
organism fails and the individual dies. In this view, the force of mortality 
depends on the number of errors in the organism. 

Gompertz assumed that $r(x)$ declines with age at a constant rate:

$$\frac{1}{r(x)}\frac{dr(x)}{dx} = -\beta,$$

or in other words that $\frac{d \ln r(x)}{dx} = -\beta$ and $d \ln r(x) = -\beta\,dx$, with
$\beta$ positive. The solution of the latter equation is $\ln r(x) = -\beta x + C$ and
$r(x) = \exp[-\beta x + C]$, where $C$ is the constant of integration determined from an
initial condition. At the initial age of, say, 20 years, $x = 0$, $r(0) = \exp[C]$,
and therefore $C = \ln r(0)$. If the resistance to death at age 0 is denoted by $r_0$,
then the resistance to death at age $x > 0$ equals:

$$r(x) = r_0 \exp(-\beta x)$$

Gompertz assumed that the force of mortality is inversely related to the
resistance to death:

$$\mu(x) = \frac{B}{r(x)}$$

where $\mu(x)$ is the instantaneous rate of death at age $x$. At $x = 0$, $\mu(0) = B/r_0$,
which is denoted by $\theta$. Hence the force of mortality is

$$\mu(x) = \frac{B}{r(x)} = \frac{B}{r_0 \exp(-\beta x)} = \frac{B}{r_0} \exp[\beta x] = \theta \exp[\beta x], \qquad x > 0 \tag{5.1}$$

where $r_0$ is the resistance to death at the initial age. The parameter $\beta$ measures
the change in $\mu(x)$ on the logarithmic scale:
$\beta = \ln[\mu(x)/\mu(x-1)] = \ln \mu(x) - \ln \mu(x-1)$.
The instantaneous death rate may be written as the log-linear model:

$$\ln \mu(x) = \ln \theta + \beta x = \theta^* + \beta x$$

The logarithm of the mortality rate is a linear function of age (time). In some
fields, such as in sociology, the log-linear model specification is common
practice. The force of mortality increases exponentially with age at a rate $\beta$:

$$\frac{1}{\mu(x)}\frac{d\mu(x)}{dx} = \beta$$

When $\beta = 0$, the resistance to death remains fixed at the initial value, $r(x) = r_0$,
$\mu(x)$ remains constant at $\theta$ for all $x$, and the Gompertz model reduces to the
exponential model. Hence, $\theta$ is the force of mortality under the exponential
model. If the resistance to death is scaled such that $r_0$ is equal to 1, then
$\mu(x) = \theta = B$.
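
The link between declining resistance and the exponentially rising force of mortality in (5.1) can be checked numerically. The sketch below is illustrative only: the parameter values are assumed rather than estimated, and the function names are ours.

```python
import math


def resistance(x, r0, beta):
    """Resistance to death r(x) = r0 * exp(-beta * x)."""
    return r0 * math.exp(-beta * x)


def gompertz_hazard(x, theta, beta):
    """Force of mortality mu(x) = theta * exp(beta * x), equation (5.1)."""
    return theta * math.exp(beta * x)


# Assumed values for illustration (x counts years since the initial age).
B, r0, beta = 0.0005, 1.0, 0.09
theta = B / r0

for x in (0, 20, 40, 60):
    # B / r(x) and theta * exp(beta * x) are the same quantity.
    print(x, B / resistance(x, r0, beta), gompertz_hazard(x, theta, beta))

# beta is the one-year change of ln mu(x) ...
log_step = math.log(gompertz_hazard(41, theta, beta) / gompertz_hazard(40, theta, beta))
# ... so the force of mortality doubles every ln(2) / beta years.
doubling_time = math.log(2) / beta
print(log_step, doubling_time)
```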

To derive the survival function, an intermediate statistic is calculated; namely,
the cumulative death rate or cumulative hazard rate. It is:

$$H(x) = \int_0^x \mu(t)\,dt = \frac{\theta}{\beta}\left[\exp(\beta x) - 1\right] = \lambda\left[\exp(\beta x) - 1\right] \tag{5.2}$$

with $\theta$ the initial or baseline force of mortality, i.e. the death rate at $x = 0$, and
$\beta$ the change in the force of mortality on the logarithmic scale, and where
$\lambda = \theta/\beta$, i.e. the ratio of the initial death rate and the rate of change. When $\beta = 0$,
the force of mortality is constant and $H(x) = \theta x$. When $\beta$ is positive ($0 < \beta < 1$),
the force of mortality increases with age, and $\lambda$ is larger than the
baseline force of mortality. The larger $\beta$, the smaller $\lambda$.

The survival function is

$$S(x) = \exp[-H(x)] = \exp\left[-\lambda\left[\exp(\beta x) - 1\right]\right] \tag{5.3}$$

which may also be written as $\exp[\lambda[1 - \exp(\beta x)]]$. The survival function is
S-shaped and $S(x) = 1$ when $x = 0$. The mid-point of the curve or the point of
inflexion, i.e. where the second derivative is equal to zero, is at
$S(x) = \exp(\lambda)/\exp(1) = 0.368 \exp(\lambda)$.

When $\beta$ is very small, the Gompertz survival function tends to the exponential
survival function. (5.2) shows that the cumulative death rate is then $H(x) = \theta x$ and
the survival function is $\exp[-\theta x]$, which is an exponential function. The
property can also be derived directly from the survival function (5.3) which
may be conveniently written as

$$S(x) = \exp\left[-\theta\,\frac{\exp(\beta x) - 1}{\beta}\right]$$

To show that, note that if a scalar $z$ approaches 0 ($z \to 0$), then the expression
$(y^z - 1)/z$ goes to $\ln y$. In the example mentioned above, $y$ is equal to $\exp(x)$
and $z$ is equal to $\beta$. Hence, as $\beta$ moves towards zero, the survival function
tends to $\exp[-\theta \ln\{\exp(x)\}] = \exp[-\theta x]$, as expected.
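
The limiting behaviour described here can be seen numerically. The following sketch, with an assumed baseline rate and function names of our own choosing, evaluates the Gompertz survival function for progressively smaller values of the rate parameter and compares it with the exponential survival function.

```python
import math


def cumulative_hazard(x, theta, beta):
    """H(x) = (theta / beta) * (exp(beta * x) - 1); equals theta * x when beta is 0."""
    if beta == 0.0:
        return theta * x
    # expm1 keeps precision when beta * x is close to zero.
    return theta / beta * math.expm1(beta * x)


def survival(x, theta, beta):
    """S(x) = exp(-H(x)), equation (5.3)."""
    return math.exp(-cumulative_hazard(x, theta, beta))


theta = 0.0005   # assumed baseline force of mortality
x = 50.0

for beta in (0.09, 0.01, 1e-4, 1e-8):
    print(beta, survival(x, theta, beta), math.exp(-theta * x))
```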



The probability density associated with the Gompertz model is given by

$$f(x) = -\frac{dS(x)}{dx} = \mu(x)\,S(x) = \theta \exp(\beta x)\,S(x) \tag{5.4}$$

which may be written as $\lambda\beta \exp[\beta x + \lambda[1 - \exp(\beta x)]]$.
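
As a check on the density and on the point of inflexion of the survival curve stated in Section 5.2, the sketch below integrates the density numerically and locates its peak. Parameter values are assumed for illustration and the helper names are ours.

```python
import math

theta, beta = 0.0005, 0.09     # assumed values for illustration
lam = theta / beta


def survival(x):
    """S(x) = exp(lam * (1 - exp(beta * x)))."""
    return math.exp(lam * (1.0 - math.exp(beta * x)))


def density(x):
    """f(x) = lam * beta * exp(beta * x + lam * (1 - exp(beta * x)))."""
    return lam * beta * math.exp(beta * x + lam * (1.0 - math.exp(beta * x)))


# Trapezoidal integration of the density over 0..200 years since the initial age.
step = 0.01
n_steps = int(200 / step)
total = sum(0.5 * (density(i * step) + density((i + 1) * step)) * step
            for i in range(n_steps))

# The density peaks, and S(x) has its inflexion, where exp(beta * x) = 1 / lam.
x_star = math.log(1.0 / lam) / beta

print(total, x_star, survival(x_star), math.exp(lam - 1.0))
```

The peak exists at a positive age only when the ratio of the baseline rate to the rate of change is below one, which holds for the values assumed here.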



5.3 | The Gompertz model as a model of survival and duration data

5.3.1. The Gompertz Model and the Accumulation of Defects 

The concept of resistance remains abstract. What determines resistance to 
death? The concept may be better understood when it is related to defects 
accumulating over time. The approach is prevalent in radiation literature (see 
Olshansky and Carnes, 1997) and sociological applications of survival and 
duration analysis (see e.g. Tuma and Hannan, 1984 and Blossfeld and 
Rohwer, 1995). As a person ages, the number of defects accumulates. Let the
death rate depend on the number of defects in the organism. Assume that the
defects occur at a constant rate $\beta$ (in terms of probability theory, it is the
arrival rate of defects). The number of defects before time/age $x$, $N(x)$, is a
random variable which follows a Poisson distribution with the
parameter $\beta x$ (Çinlar, 1975). The expected number of defects in the interval
$[0,x]$ is $E[N(x)] = \beta x$. The death rate at $x$ depends on the number of defects at
that age. Since the number of defects is not observed, it is approximated
by its expected value. The death rate is related to the number of defects in
the following way: $\ln \mu(x) = \beta x$. Thus, if we hypothesise a causal mechanism
that generates linearly increasing numbers of defects, the mortality rate
changes following the Gompertz model.
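
The arrival mechanism described above can be simulated directly. The sketch below, with an assumed arrival rate and a helper name of our own, generates defect counts from exponential waiting times and compares the average count with the expected value.

```python
import random


def count_defects(age, beta, rng):
    """Number of arrivals in [0, age] of a Poisson process with rate beta."""
    t, n = 0.0, 0
    while True:
        t += rng.expovariate(beta)   # exponential waiting time to the next defect
        if t > age:
            return n
        n += 1


rng = random.Random(1825)   # fixed seed so that the run is reproducible
beta = 0.5                  # assumed arrival rate of defects per year
age = 40.0

counts = [count_defects(age, beta, rng) for _ in range(2000)]
mean_count = sum(counts) / len(counts)

# The simulated mean should be close to the expected number beta * age.
print(mean_count, beta * age)
```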

The accumulation of defects over time may be generalised to the accumulation 
of any attribute. As a result, the Gompertz model is being applied in studies of 
events other than death. For instance, in sociological studies of changes in 
employment status, job experience is the attribute being accumulated. When 
job experience accumulates linearly over time, i.e. increases with the number 
of years in a particular job, the rate of leaving the job follows a Gompertz 
distribution (Blossfeld and Rohwer, 1995). In that case, p is negative (as the 
job-specific experience increases, the rate of leaving the job decreases). Other 
examples are given by Tuma and Hannan (1984). In event history literature, 
the accumulation of an attribute is viewed as a latent covariate varying over 
time, i.e. a time-varying covariate, and the linear function of time or duration is
only one of the possible duration dependencies (see e.g. Blossfeld and
Rohwer, 1995). Considering the Gompertz model as a particular model of
duration dependence is relevant for at least three reasons. First, the Gompertz
model may be easily extended to include the effects of covariates. Let $\mathbf{z}$ denote
a vector of covariates that do not depend on time or duration (or age). The rate
at which the event (e.g. death) occurs at duration or age $x$ may be described by
the log-linear model:

$$\ln \mu(x) = \mathbf{z}'\boldsymbol{\beta} + \beta x$$

which is a simple extension of the basic Gompertz model with a set of
covariates, where $\boldsymbol{\beta}$ (in bold) is the vector of coefficients of the covariates, as
distinct from the scalar rate $\beta$. The model is the Gompertz model with
$\theta^* = \mathbf{z}'\boldsymbol{\beta}$. Second, the basic
Gompertz model and the extension may be estimated using standard software
for the analysis of event history data, such as SPSS, TDA (Blossfeld and
Rohwer, 1995) and ℓEM (Vermunt, 1997). Third, using the specialised
software, the validity of the Gompertz model may be easily assessed in
relation to the validity of related models, such as the model based on the
Weibull distribution.
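
A short sketch may clarify what the covariate extension implies. The covariates, coefficient values and names below are hypothetical; the point shown is that covariates which do not vary with age shift the log rate by a constant, so the ratio of two individuals' rates is the same at every age.

```python
import math


def hazard(x, z, coef, beta):
    """mu(x) = exp(z'coef + beta * x): Gompertz rate with fixed covariates."""
    linear_part = sum(zi * ci for zi, ci in zip(z, coef))
    return math.exp(linear_part + beta * x)


# Hypothetical covariate vector: [constant, smoker indicator]; coefficients assumed.
coef = [math.log(0.0005), 0.6]
beta = 0.09

non_smoker = [1.0, 0.0]
smoker = [1.0, 1.0]

ratios = [hazard(x, smoker, coef, beta) / hazard(x, non_smoker, coef, beta)
          for x in (0, 20, 40, 60)]
print(ratios)
```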

So far, the Gompertz function has been derived from two perspectives, that
proposed by Gompertz himself and that related to the arrival rate of defects.
More causal mechanisms can be described that result in the Gompertz
function. For instance, Abrams and Ludwig (1995) derive the Gompertz from
the ‘disposable soma’ theory which states that senescence arises from a
balancing of resources between reproduction and somatic repair. The
relationship between reproduction and senescence is the main subject of
evolutionary theories of ageing elaborated by Olshansky and Carnes (1994).
More details on the ageing theories underlying the Gompertz model can be 
found in e.g. Olshansky and Carnes (1997); see also Yashin (in this volume). 



5.3.2. The Gompertz Model as a Generalised Logistic Distribution 

An interesting relationship exists between the Gompertz model and the logistic 
distribution. Both the Gompertz and the logistic are members of the 
generalised logistic distribution. The significance of the relationship for 
practical research is related to different slopes of two S-shaped distributions. 
Compared with the logistic, the Gompertz changes faster early but approaches 
the asymptote more slowly. The two curves differ in the tails and have
different inflexion points. The logistic distribution is widely used in survival
analysis. The logit model, often used in discrete-time survival analysis, implies 
that the probability varies between zero and one following a logistic 
distribution. The Gompertz, which is also an S-shaped distribution, could be 
an alternative to the logistic when the shape of the distribution in the tails is 
important (see Section 5.4). 

A generalisation of the logistic distribution is

$$\frac{dS(x)}{dx} = -\frac{\beta}{\delta}\,S(x)\left[1 - \left(\frac{S(x)}{S_s}\right)^{\delta}\right] \tag{5.5}$$

where $\beta$ is the baseline rate of change, $S_s$ is the maximum of the cumulative
distribution (‘carrying capacity’ in the logistic growth model; it is unity in the logit
model), and $\delta$ represents the asymmetry of the curve. When $\delta = 1$, the
equation is logistic and the survival curve is symmetrical about the point of
inflexion which is at $S(x) = 0.5 S_s$. The hazard function varies linearly with the
probability of surviving:

$$\frac{1}{S(x)}\frac{dS(x)}{dx} = -\mu(x) = -\beta\left[1 - \frac{S(x)}{S_s}\right] \tag{5.6}$$

which is the standard specification of the logistic model. The instantaneous
death rate may be written as the linear function

$$\mu(x) = a - b\,S(x) \tag{5.7}$$

where $a = \beta$ and $b = \beta/S_s$ with $S_s = 1$ since the function considered is the
survival function. The survival function is

$$S(x) = \frac{1}{1 + c \exp(\beta x)} \tag{5.8}$$

with $c$ a positive constant fixed by the initial condition.



For $\delta < 1$, the maximum change occurs for $S(x)$ less than $0.5 S_s$ but greater
than $S_s/e$ (Parthasarathy and Krishna Kumar, 1992). When $\delta \to 0$, the
Gompertz model arises. To show that, we need an asymptotic property already
used in Section 5.2: if a scalar $z \to 0$, then $(y^z - 1)/z$ approaches $\ln y$. Let $\delta = z$
and $y = S(x)/S_s$, then

$$\lim_{\delta \to 0} \frac{1}{\delta}\left[1 - \left(\frac{S(x)}{S_s}\right)^{\delta}\right] = -\ln\left(\frac{S(x)}{S_s}\right)$$

The Gompertz model is:

$$\frac{dS(x)}{dx} = -\beta\,S(x) \ln\left[\frac{S_s}{S(x)}\right] \tag{5.9}$$



Let $\ln S_s = \theta/\beta$ (which is $\lambda$), then two interesting equations arise for the rate of
change of the survival function. First, Equation (5.9) may be written as

$$\frac{1}{S(x)}\frac{dS(x)}{dx} = -\beta\left[\frac{\theta}{\beta} - \ln S(x)\right] = -\left[\theta - \beta \ln S(x)\right]$$

and the force of mortality is

$$\mu(x) = a - b \ln S(x) \tag{5.10}$$

where $a = \theta$ and $b = \beta$. Equations (5.7) and (5.10) show the relations
between the force of mortality and the survival function in the logistic and the
Gompertz model.

Second, a comparison of Equations (5.4) and (5.9) indicates that

$$\exp(\beta x) = 1 - \frac{\beta}{\theta} \ln S(x) \tag{5.11}$$
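
The limit that yields the Gompertz model, and the relations (5.10) and (5.11), can be verified numerically. In the sketch below the parameter values are assumed, the function names are ours, and the upper level of the curve is set from the relation between its logarithm and the ratio of the two Gompertz parameters used in the text.

```python
import math

theta, beta = 0.0005, 0.09        # assumed values for illustration
lam = theta / beta
S_s = math.exp(lam)               # from ln S_s = theta / beta


def generalised_logistic_rate(S, delta):
    """dS/dx from the generalised logistic equation (5.5)."""
    return -(beta / delta) * S * (1.0 - (S / S_s) ** delta)


def gompertz_rate(S):
    """dS/dx from (5.9), the limit of (5.5) as delta tends to 0."""
    return -beta * S * math.log(S_s / S)


def gompertz_survival(x):
    """S(x) = exp(lam * (1 - exp(beta * x)))."""
    return math.exp(lam * (1.0 - math.exp(beta * x)))


x = 60.0
S = gompertz_survival(x)
mu = theta * math.exp(beta * x)

# As delta shrinks, the generalised logistic rate approaches the Gompertz rate.
for delta in (1.0, 0.1, 1e-3, 1e-6):
    print(delta, generalised_logistic_rate(S, delta), gompertz_rate(S))

# (5.10): mu(x) = theta - beta * ln S(x); (5.11): exp(beta x) = 1 - (beta/theta) ln S(x).
print(mu, theta - beta * math.log(S))
print(math.exp(beta * x), 1.0 - (beta / theta) * math.log(S))
```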



Unlike the logistic, the Gompertz is not symmetric. If this shape is prevalent in
the data, the Gompertz model should be selected. If the data resemble the
logistic more than the Gompertz, the logistic should be selected to avoid
problems of misspecification. An advantage of representing the two models as
special cases of a more general survival model is that the more general model
may be estimated and the value of the parameter $\delta$ determines which model,
the Gompertz or the logistic, is a more appropriate mathematical
representation of the data.

From (5.9), one may derive an expression for the death rate under the 
Gompertz model comparable to Equation (5.7) that represents the expression 
under the logistic model. 

In biology, the generalised logistic distribution (5.5) is also known as the 
Richards’ family of growth models or Richards function (Richards, 1959, 
1969 and Brown and Rothery, 1993). A modification of this family of models 
was presented by Heitjan (1991a, 1991b) and applied by Lambert (1996). 
Recently, a new function was derived that includes the Gompertz, the logistic 
and several other models of change (France et al., 1996). Another interesting
extension is by Thornley (1990), who gives a new formulation of the logistic
in terms of two differential equations which permits the asymptote to depend 
on conditions during the period of change. A few studies assess the effects of 
specifying a Gompertz versus a generalised logistic: Franses and Van der Nol 
(1997) in the field of marketing, and Lim et al. (1998) in botany. No 
comparative studies are known in survival analysis. As far as I know, the only 
reference to the generalised logistic distribution (5.5) in the field of 
demography was by Biswas (1998). 



5.3.3. The Gompertz Model and the Weibull Distribution 

Another function to which the Gompertz is related is the Weibull. Imaizumi 
(1996) applied the Gompertz and the Weibull to mortality rates from breast 
cancer in Japan from 1950 to 1993 and found that the Weibull performs
better than the Gompertz. Prieto et al. (1996) used the Gompertz and the
Weibull models to analyse Spanish mortality data from 1900 to 1992. They
found, however, that the Gompertz model fits age-specific mortality rates of
adult men and women better than the Weibull distribution. Despite these
contradictory results, the Weibull may be of interest because of its ability to
describe mortality patterns. 

The Gompertz and the Weibull are related via a transformation of the age 
variable. If the age scale is transformed into log scale, then the Weibull 
distribution becomes a Gompertz distribution. In other words, if the age at 
death X follows a Weibull distribution, then the random variable Y = In X 
follows a Gompertz distribution. For further details on the relationship
between the Gompertz and the Weibull in the context of lifetime data analysis,
see Courgeau and Lelievre (1992) and Petersen (1995). 
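The effect of the log transformation of age can be illustrated numerically. The sketch below assumes the Weibull survival function in the form of Equation (5.13), $S(x) = \exp[-(\beta x)^{\lambda}]$, and shows that the survival function of $Y = \ln X$ has the exponential-of-exponential form $\exp[-k \exp(\lambda y)]$ with $k = \beta^{\lambda}$. The parameter values and function names are hypothetical.

```python
import math

# Hypothetical Weibull parameters, for illustration only.
BETA = 0.012   # scale-type parameter
LAM = 6.0      # shape parameter


def weibull_survival(x, beta=BETA, lam=LAM):
    """Weibull survival S(x) = exp[-(beta * x) ** lam]."""
    return math.exp(-((beta * x) ** lam))


def log_age_survival(y, beta=BETA, lam=LAM):
    """Survival of Y = ln X: Pr(Y > y) = exp[-k * exp(lam * y)], k = beta ** lam."""
    k = beta ** lam
    return math.exp(-k * math.exp(lam * y))


# Pr(X > x) and Pr(ln X > ln x) describe the same event, so they must coincide.
for x in (40.0, 70.0, 90.0):
    print(x, weibull_survival(x), log_age_survival(math.log(x)))
```

Because $y$ ranges over the whole real line, the form obtained here is the untruncated one; the Gompertz on non-negative ages corresponds to the same form truncated at zero.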

Consider the equation 

$$\ln \mu(x) = a + bx + c \ln x \tag{5.12}$$

If $c = 0$, the function is the Gompertz model; if $b = 0$ and $c > -1$, it is the
Weibull model; if $b = 0$ and $c = 0$, it is the exponential distribution. In
regression analysis, Equation (5.12) may be applied to explore which of these 
functions better describe a given mortality profile. When the profile follows 
the Gompertz model, b is significantly different from zero while c is not 
significantly different from zero. The hazard rate (force of mortality) and the 
survival function of the Weibull model are: 

$$\mu(x) = \lambda\beta\,(\beta x)^{\lambda - 1}$$

$$S(x) = \exp\left[-(\beta x)^{\lambda}\right] \tag{5.13}$$

An interesting extension of the Weibull is the Burr distribution, which describes
mortality that first increases, reaches a peak at a certain moment and then
slowly declines. The Burr distribution results from a Weibull
distribution with $\beta$ following a gamma distribution (Gupta et al., 1996). It
captures unobserved individual variation in what Olshansky and Carnes (1997)
call ‘endowments for longevity’. Manton et al. (1993) considered a case where
the age profile of mortality rates of members of a cohort is described by a
Weibull distribution and the unobserved heterogeneity among cohort members
(frailty) is gamma-distributed. However, neither Manton et al. nor Blossfeld
and Rohwer (1995), who give an introduction to the properties of Weibull
models with gamma mixture, mention the Burr distribution.



5.4 | The Gompertz Distribution as an Extreme Value Distribution

5.4.1. The Gompertz as a Truncated Extreme Value Distribution 

Extreme value distributions are employed to describe the failure times or 
lifetimes of systems that cease to function whenever the weakest component 
fails. A significant observation, which has not yet been considered in
demographic analyses, is that the Gompertz function is a truncated Type I
extreme value distribution (Johnson et al., 1995). The function is truncated
from below at x = 0. Most truncated distributions are used only occasionally, but
some are used regularly, such as the truncated normal
distribution which, in econometrics, is referred to as the Tobit model.

It is easy to show that the Gompertz is a truncated extreme value distribution.
Consider the Type I extreme value distribution (see e.g. McCullagh and
Nelder, 1989):

$$S(x) = \exp[-\exp(\beta x)]$$

The cumulative hazard is the exponential function $H(x) = \exp(\beta x)$ and the
inverse function is $\ln[-\ln S(x)] = \beta x$.

A generalised Type I extreme value survival function is given below

$$S'(x) = \exp[-\lambda \exp(\beta x)], \qquad -\infty < x < \infty$$

The survival function truncated at 0 is the following

$$S(x) = \frac{S'(x)}{S'(0)} = \frac{\exp[-\lambda \exp(\beta x)]}{\exp(-\lambda)} = \exp[\lambda(1 - \exp(\beta x))], \qquad x > 0 \tag{5.14}$$

which is the Gompertz (see Equation 5.3). Hence, the Gompertz is a truncated
Type I extreme value distribution.
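The truncation argument can be verified numerically. The sketch below is an illustration with hypothetical parameter values: it divides the generalised Type I extreme value survival function by its value at zero and compares the result with the Gompertz survival function of Equation (5.14).

```python
import math

# Hypothetical parameter values, for illustration only.
LAM = 0.001
BETA = 0.1


def ev_survival(x, lam=LAM, beta=BETA):
    """Generalised Type I extreme value survival S'(x) = exp[-lam * exp(beta * x)]."""
    return math.exp(-lam * math.exp(beta * x))


def truncated_at_zero(x, lam=LAM, beta=BETA):
    """Survival conditional on X > 0, i.e. S'(x) / S'(0)."""
    return ev_survival(x, lam, beta) / ev_survival(0.0, lam, beta)


def gompertz_survival(x, lam=LAM, beta=BETA):
    """Gompertz survival S(x) = exp[lam * (1 - exp(beta * x))]."""
    return math.exp(lam * (1.0 - math.exp(beta * x)))


for age in (0.0, 30.0, 60.0, 90.0):
    print(age, truncated_at_zero(age), gompertz_survival(age))
```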

The density associated with the Gompertz is given by (5.4). If the origin
(truncation) is at $x_0$ instead of at 0, the density is (Johnson et al., 1995):

$$f(x') = \theta' \exp\left[\beta x' + \frac{\theta'}{\beta}\left(1 - \exp(\beta x')\right)\right] \tag{5.15}$$

where $x' = x - x_0$ and $\theta' = \theta \exp(\beta x_0)$, which is the force of mortality at $x_0$.
Thus truncating a Gompertz distribution at $x_0$ and setting the origin at $x_0$ leaves
the distribution unchanged except that the constant $\theta$ changes to $\theta'$. The density
may also be written as:

$$f(x') = \beta\lambda' \exp\left[\beta x' + \lambda'\left(1 - \exp(\beta x')\right)\right] \quad \text{with } \lambda' = \theta'/\beta \tag{5.16}$$
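The invariance of the Gompertz under truncation can be checked in the same way. The sketch below, using hypothetical parameter values, compares the density of remaining lifetime beyond $x_0$, computed as $f(x_0 + x')/S(x_0)$, with the Gompertz density that has $\theta$ replaced by $\theta' = \theta \exp(\beta x_0)$.

```python
import math

# Hypothetical parameter values, for illustration only.
THETA = 0.0001
BETA = 0.1
X0 = 60.0        # truncation point (new origin)


def gompertz_density(x, theta, beta):
    """f(x) = theta * exp[beta*x + (theta/beta) * (1 - exp(beta*x))]."""
    return theta * math.exp(beta * x + (theta / beta) * (1.0 - math.exp(beta * x)))


def gompertz_survival(x, theta, beta):
    """S(x) = exp[(theta/beta) * (1 - exp(beta*x))]."""
    return math.exp((theta / beta) * (1.0 - math.exp(beta * x)))


def truncated_density(x_prime, theta=THETA, beta=BETA, x0=X0):
    """Density of x' = x - x0 given survival to x0."""
    return gompertz_density(x0 + x_prime, theta, beta) / gompertz_survival(x0, theta, beta)


theta_prime = THETA * math.exp(BETA * X0)   # force of mortality at x0

for x_prime in (0.0, 10.0, 30.0):
    print(x_prime, truncated_density(x_prime),
          gompertz_density(x_prime, theta_prime, BETA))
```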



5.4.2. Possible Implications of the New Perspective 

The link between the Gompertz distribution and the extreme value distribution 
and the application of extreme value theory may open new research 
perspectives. For a recent illustration, see Aarssen and de Haan (1994). We 
therefore consider some additional properties of the extreme value distribution. 
The relevance of the properties may not be immediately clear. They indicate, 
however, a promising direction of mortality research for two reasons. The first 
reason is substantive. The research may point to a link between death and a 
component crossing a certain critical threshold (maximum value). Aspects of 
theories of ageing, not captured by the Gompertz, may be captured by other 
members of the family of extreme value distributions. The second is 
methodological. Scientific and practical interest in the extreme value 
distribution is increasing rapidly (see e.g. Beirlant et al., 1996). The findings
may be applied to improve on the Gompertz model to describe and forecast 
mortality. Because of its potential, the main properties of the extreme value 
distribution, collected from probability theory, are listed in this chapter. 

The Type I extreme value distribution may be written as

$$\Pr\{X \le x\} = \exp\left[-\exp\left(-\frac{x - \xi}{\theta}\right)\right] \tag{5.17}$$

where $\xi$ is a location parameter and $\theta$ is a scale parameter; $\theta$ is different from
the $\theta$ used before. The probability density function is the following

$$f(x) = \frac{1}{\theta} \exp\left(-\frac{x - \xi}{\theta}\right) \exp\left[-\exp\left(-\frac{x - \xi}{\theta}\right)\right] \tag{5.18}$$

If $\xi = 0$ and $\theta = 1$, or equivalently, if $Y = (X - \xi)/\theta$, we have the standard
form:

$$f_Y(y) = \exp[-y - \exp(-y)]$$

Some properties of the Type I extreme value distribution are the following: 

a. If the random variable X has a Weibull distribution, then Y = ln X follows
a Gompertz distribution and a Type I extreme value distribution.

b. If the random variable X has a Type I extreme value distribution with the
density in its standard form

$$f(x) = \exp[-x - \exp(-x)] \tag{5.19}$$

then exp(-X) has a standard exponential distribution (Johnson et al., 1995).

c. If two independent random variables each have the same Type I
distribution, then their difference has a logistic distribution. In the 1970s,
this observation was the basis for the development of the logit model in the
context of discrete choice theory (McFadden, 1974). If the utility being
maximised is measured by a Type I extreme value random variable, the
optimal choice among discrete alternatives is given by the logit model.

d. The Type I extreme value distribution may be generalised by introducing an
extra parameter $\kappa$ (the generalisation is due to Dubey, 1969; see Johnson et
al., 1995). The cumulative distribution is:



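$$\Pr\{X \le x\} = \exp\left[-\kappa\theta \exp\left(-\frac{x - \xi}{\theta}\right)\right] \tag{5.20}$$

However, since:

$$\kappa\theta \exp\left(-\frac{x - \xi}{\theta}\right) = \exp\left(-\frac{x - \xi - \theta \ln(\kappa\theta)}{\theta}\right) = \exp\left(-\frac{x - \xi'}{\theta}\right)$$

with $\xi' = \xi + \theta \ln(\kappa\theta)$, it can be seen that X still has a Type I extreme value
distribution.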

The model has an interesting extension in case of unobserved heterogeneity.
When the parameter $\kappa$ is a random variable and follows the gamma probability
density [Gamma($p$, $\beta$)]:

$$p_{\kappa}(t) = \frac{\beta^{p}}{\Gamma(p)}\, t^{p-1} \exp(-\beta t), \qquad t > 0;\ p > 0;\ \beta > 0$$

then the distribution that results is a generalised logistic distribution, known as
the Type I generalised logistic distribution (Johnson et al., 1995).

Consider a different generalisation:

$$\Pr\{X \le x\} = 1 - \exp\left[-\kappa\theta \exp\left(\frac{x - \xi}{\theta}\right)\right]$$

Using the same gamma distribution, the cumulative distribution function is

$$\Pr\{X \le x\} = 1 - \frac{\exp[-p(x - \xi)/\theta]}{\left[\dfrac{\theta}{\beta} + \exp\left(-\dfrac{x - \xi}{\theta}\right)\right]^{p}} \tag{5.21}$$

This distribution is the Type II generalised logistic distribution. The Type I
and Type II generalised logistic distributions are related by a simple negation
of the random variables: if X follows a Type I generalised logistic distribution,
then -X follows a Type II generalised logistic distribution (Johnson et al.,
1995).

e. Another generalisation of the standard Type I extreme value distribution is
the standard log-gamma function (Johnson et al., 1995):

$$f_Y(y) = \frac{1}{\Gamma(p)} \exp[py - \exp(y)], \qquad -\infty < y < \infty;\ p > 0 \tag{5.22}$$

If Y follows the log-gamma distribution and p = 1, then -Y follows a standard
Type I extreme value distribution. By introducing a location parameter $\xi$ and a
scale parameter $\theta$, we obtain the three-parameter log-gamma density function
(Johnson et al., 1995):

$$f_X(x) = \frac{1}{\theta\,\Gamma(p)} \exp\left[\frac{p(x - \xi)}{\theta} - \exp\left(\frac{x - \xi}{\theta}\right)\right], \qquad p > 0,\ \theta > 0$$
which is, surprisingly enough, the Coale-McNeil model of first marriage rates
(Coale and McNeil, 1972) and known in demography as the double
exponential distribution (Rogers, 1986). To see the equivalence, let $p/\theta = \alpha$
and $1/\theta = \lambda$ such that $p = \alpha/\lambda$, and let $\mu = -\xi$. If X is log-gamma distributed,
then Y = -X has the Coale-McNeil probability density function:

$$f_Y(y) = \frac{\lambda}{\Gamma(\alpha/\lambda)} \exp\left[-\alpha(y - \mu) - \exp(-\lambda(y - \mu))\right] \tag{5.23}$$

For the proof of Equation (5.23), see Liang (1997). The proof shows the
Coale-McNeil model as a generalisation of the Type I extreme value
distribution (for further details on the link between the Coale-McNeil model
and the extreme value distribution, see Liang (1997)). Note that, if $\alpha = \lambda = 1$
and the location parameter $\xi = \mu = 0$, then

$$f_Y(y) = \exp[-y - \exp(-y)]$$
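Properties (b) and (c) listed above lend themselves to a quick simulation check. The sketch below is an illustration only: it draws from the standard Type I extreme value distribution by inverting $F(x) = \exp[-\exp(-x)]$, and compares simple sample statistics with their theoretical counterparts. The sample size, the seed and the function names are arbitrary choices made for the example.

```python
import math
import random


def sample_type1(rng):
    """Draw from the standard Type I extreme value distribution by inversion:
    u = exp[-exp(-x)]  =>  x = -ln(-ln u)."""
    u = rng.random()
    while u == 0.0:          # random() may return 0.0, which has no logarithm
        u = rng.random()
    return -math.log(-math.log(u))


def empirical_checks(n=100_000, seed=12345):
    rng = random.Random(seed)
    # Property (b): exp(-X) should be standard exponential.
    exp_draws = [math.exp(-sample_type1(rng)) for _ in range(n)]
    mean_exp = sum(exp_draws) / n                       # theory: 1
    tail_exp = sum(1 for e in exp_draws if e > 1.0) / n  # theory: exp(-1)
    # Property (c): the difference of two independent draws should be logistic.
    diffs = [sample_type1(rng) - sample_type1(rng) for _ in range(n)]
    cdf_at_1 = sum(1 for d in diffs if d <= 1.0) / n     # theory: 1 / (1 + exp(-1))
    return mean_exp, tail_exp, cdf_at_1


mean_exp, tail_exp, cdf_at_1 = empirical_checks()
print(mean_exp, 1.0)
print(tail_exp, math.exp(-1.0))
print(cdf_at_1, 1.0 / (1.0 + math.exp(-1.0)))
```

Each printed pair shows a simulated value next to its theoretical value; the two should agree to within sampling error.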



5.5 | Conclusion

Mortality forecasting most often relies on trend models. Although process 
models are advocated as being superior, the knowledge base that is required to 
model the causal mechanisms implicit in process models remains inadequate. 
Part of the gap between trend models and process models may be bridged by 
viewing ‘laws of mortality’ as models that have been developed in statistics to 
analyse duration data, including survival or lifetime data, and event-history 
data. In this view, mortality analysis and mortality forecasting are situated
within the context of duration analysis and event-history analysis. The 
adoption of this mental frame requires researchers to view (i) death as a life 
event, not much different from other life events, and (ii) the age-specificity of 
mortality as an illustration of duration dependence, much in the same way that 
Gompertz associated the increasing rate of mortality during the life course with 
the declining vital force, which is itself a result of the accumulation of factors 
that reduce the ability to resist death. This chapter discussed the Gompertz 
model within a broader context of models that include the logistic distribution, 
the Weibull distribution and the extreme value distribution. A particularly 
interesting perspective on the Gompertz is to view it as a truncated Type I
extreme value distribution. This perspective triggered further investigation of
the family of extreme value distributions. Surprisingly enough, a 
generalisation of the Type I extreme value distribution resulted in the Coale- 
McNeil model of first marriage rates, also known in demography as the 
double exponential distribution and used by Rogers (1986) to model a variety 
of age profiles including the mortality profile. The relationship between the 
Gompertz and the Coale-McNeil model is worth further investigation. A 
particularly interesting observation in that respect is that the Coale-McNeil 
model may be written as the convolution of a normal distribution and a few 
(three) exponential distributions, which represent delays or reaction times 
(Coale and McNeil, 1972). Although the significance of this observation is not 
clear at present, it may indicate the direction of research that is required to 
uncover the causal mechanisms underlying the changes in the ‘resistance to 
death’. This chapter includes a number of features of the Gompertz model and
related models that, although seemingly awkward today, may one day be used 
in the ongoing attempt to improve mortality forecasts. Improvements are 
expected in (i) the representation of the different shapes or curvatures of 
mortality profiles, including shapes at high ages, (ii) the statistical techniques 
to estimate the model parameters from the data (e.g. maximum likelihood 
method), and (iii) the application, in the field of mortality analysis and 
forecasting, of standard software for the statistical analysis of lifetime data. 
More important, however, will be the detection of new or better theories of 
ageing that result from applying extreme value theory. 
6. Comparing Theoretical Age Patterns of Mortality Beyond the Age of 80



Lech BOLESLAWSKI and Ewa TABEAU 



Abstract 

Data on mortality from four European low-mortality countries, France, Italy,
the Netherlands and Norway, covering more than 40 recent years (1950-1994), have
been used to investigate the usefulness of 11 functions representing the age
pattern of mortality at the oldest ages. Functions with two parameters proved 
to be inadequate to fit mortality rates in the age interval 80-110 years. Less 
parsimonious functions, i.e. with three and more parameters, perform better. 
When reliable data exist up to a very old age, i.e. ideally up to 110 but at least 
up to 90 years, the Coale-Kisker method is suggested as the best choice. If the 
data for the oldest ages are not reliable, extrapolation of the age pattern is 
recommended by using a polynomial, provided that the quality of data is 
satisfactory for all ages up to 85 years. 



6.1 | Introduction

In the distant past, the age pattern of mortality of the oldest old (referred to 
here as 80 and over) was not well-known. Knowledge of mortality among 
persons approaching the age of 100 (or more), in particular, was mainly based 
on unreliable statistics and theoretical concepts. In recent decades, this 
situation has gradually changed. Efforts have been made to improve past as 
well as recent mortality statistics and to develop methods of estimating 
mortality age patterns in cases where data are inaccurate. The reason 
underlying these new developments is quite simple. After a 50-year
persistent decline in mortality, the number of persons surviving to very old
ages has increased dramatically. As a result, institutions involved in social
policies that strive to balance the needs of, and resources for, the elderly have
become interested in reliable and detailed population projections.

The need for detailed population estimates and projections for the elderly, 
including the oldest old, was addressed by the United Nations (UN) Working 
Group on projecting old-age mortality and its consequences, held in New York 
in December 1996 (United Nations, 1997). The participants in the Working 
Group accepted the fact that the population numbers and respective mortality 
indicators included in the official UN population projections that have been 
presented systematically in the World Population Prospects should be 
displayed by detailed partial age groups up to age 100, and with an open-ended 
age interval 100 years and over. Regarding mortality, only a few countries in 
the world (e.g. France, the Netherlands, Norway and Sweden) have 
sufficiently accurate empirical data up to age 100 that can be used directly. For 
countries with less reliable or unavailable mortality statistics for ages from 
approximately 80-85 years onwards, estimation of age-specific mortality is 
required. Such estimations can be supported by an analysis of the high-quality 
data from the few countries mentioned above. With respect to the methods 
needed for such estimations, a number of suggestions have been formulated by 
the UN Working Group. The UN propositions included parameterisation and 
other modelling techniques based on age-specific mortality indicators, and 
were designed to establish a procedure capable of providing reliable estimates 
of the oldest-old mortality in the majority of countries in the world. 

Regarding data quality, two groups of countries can be distinguished in the 
developed world. The first group includes countries with high-quality data up 
to the age of approximately 85 years and less reliable (or incomplete) mortality 
data thereafter. The second group includes countries with good data up to a 
much higher age, i.e. 95 years or older. At present, only a minority of 
countries have reliable data up to a very old age. A sound mathematical 
representation of the mortality age schedule seems to be urgently needed for 
purposes such as, for instance, international comparisons or projections. 
Mathematical models of old-age mortality in developed countries have 
therefore been chosen as the subject of this chapter. Both the modelling of 
complete reliable empirical age patterns and the modelling of data with an 
open-ended age interval 85+ were analysed and are presented in the text.
Most importantly, the following sections of this chapter show that due to the
properties of the empirical age pattern of mortality, different models might be 
preferred in each of these two cases. 

We begin with a brief discussion of the shortcomings of data and possible 
improvements in this area, and recently recommended models (Section 6.2). 
We continue by describing the empirical age patterns in four countries (Section 
6.3). Results of an analysis of the complete and reliable age patterns of 
mortality in four countries are discussed in Section 6.4. Here, 11 mathematical
functions — possibly relevant as representations of old-age mortality — have 
been tested from the point of view of their performance in fitting the empirical 
data. The survey includes the functions recently recommended by the UN 
(e.g. some logistic-like functions) as well as those considered less suitable for 
mortality of the oldest old (e.g. Gompertz) and a number of other functions 
less commonly dealt with by researchers (e.g. polynomials). It is generally 
assumed that the properties and parameters of these functions cannot be given 
any meaningful interpretation other than a mathematical one. Specifically, a 
function that fits the empirical age pattern of mortality rather well cannot be 
treated as a law, but merely as a technical tool. Modelling of the deficient or 
incomplete age patterns (that is, reliable up to a relatively low age, say 85 
years, and unreliable or missing thereafter) is elaborated on in Section 6.5. In 
this section even more models (14 instead of 11) are dealt with in the context 
of extrapolation of the age pattern of mortality. Conclusions and 
recommendations are summarised in the discussion in Section 6.6. 



6.2 | Improved Data, New Models

Data on mortality above age 80 are subject to distortion. This is partly due to 
random fluctuations in deaths which become remarkably large for ages 95 
years and over, even in countries with large populations. Random (or non- 
systematic) errors are mainly caused by a lack of caution in gathering the data, 
and therefore tend to decline with increasing population size and with more 
advanced levels of economic development. Other distortions may result from
non-random reasons, such as age misreporting in death certificates or censuses 
and errors in the estimates of population counts. Systematic errors resulting 
from age 'exaggeration' by relatives are visible in mortality statistics, as 
reflected in an unusual cumulation of deaths at round ages, especially at age 
100, and a related underreporting of deaths at ages directly preceding the age 
concerned, i.e. 96 to 99 years. Distortions such as these are observed
independently of population size. 

The age pattern of mortality at the oldest ages has only become better known 
in recent decades. In the past, the pattern was largely hypothetical, assumed on 
the basis of extrapolations from younger ages using models such as, for 
instance, the Gompertz law. The lack of reliable data above 95 years implied 
that the age-related increase in mortality was itself questioned. Historical data 
from several countries seem to suggest that the rate of increase might be 
almost invariant above a certain age, or may even decline. More recent data 
indicate that considerable fluctuations exist in the rate of change in mortality 
with age. Patterns of fluctuations are addressed in Section 6.3. 

An inventory of inconsistencies in mortality statistics for advanced ages can be 
found in Condran, Himes and Preston (1991), who, using the inter-censal 
cohort method, analysed death and census-based population data from 18 
developed countries over a period of 35 years. Coale and Kisker (1990) also 
examined the quality of recent (United States (US)) mortality rates. They used 
the method of 'extinct generations' and proved that the decline in mortality 
rates for older ages, especially in the case of the non-white population, is due 
to errors in age reporting. 

Recent knowledge of the age pattern of mortality of the oldest old seems to 
clarify many previous doubts regarding the shape of the age curve of 
mortality. In recent decades, mortality statistics have been considerably 
improved in many developed countries. Special projects were designed to 
correct mortality and population numbers back to the early 1950s, using birth 
records for checking age reporting of deaths as well as the extinct generations 
estimates of population counts. Examples of such projects include studies 
completed at Odense University in Denmark and at the Netherlands 
Interdisciplinary Demographic Institute (NIDI) in the Netherlands. This 
chapter draws on the improved statistics and data that have been derived from 
the mortality database developed at NIDI for use in another recent NIDI 
project. A more detailed description of the data can therefore be found 
elsewhere (Tabeau et al., 1997). The database was developed in co-operation
with Odense University. 

Many scientists draw attention to a remarkable feature of the age pattern of 
mortality, namely a diminishing rate of the age-related increase in mortality 
rates above a certain age. This has led researchers to seek models consistent 
with this property. In the 1930s, different applications of the logistic function
were shown for mortality by Perks and the usefulness of the exponential-quadratic
function was suggested by Trachtenberg. Some of the most recent
modelling approaches also employ these two important functions in modelling
the oldest age mortality. Examples are the complex parameterisation function
for the death probability ($q_x$) of Heligman and Pollard (1980), the model for
the rate of the age-related increase in mortality ($k_x$) proposed by Coale and
Kisker (1990) and the relational model for the mortality rate ($m_x$) developed by
Himes, Preston and Condran (1994).

The Heligman-Pollard (H-P) model in its original version has eight parameters 
and covers all ages from birth up to the most advanced ages. When applied to 
older ages, the model reduces to one component and has only two parameters 
($x$ is age and $a$, $b$ are parameters to be estimated from data):

$$q_x = \frac{b\,e^{ax}}{1 + b\,e^{ax}}$$

The Coale-Kisker (C-K) model is based on the concept of the exponential age-specific
rate of change of mortality rates: $k_x = \ln(m_x/m_{x-1})$. In this model the
rate of change is assumed to be linear above the age of 85 years:

$$k_x = k_{85} - (x - 85) \cdot s$$

which is a well-documented property (Horiuchi and Coale, 1990).

Two assumptions underlie the formulation of this model. First of all, the
estimates of mortality rates around age 85 are assumed to be reliable, implying
that the parameter $k_{85}$ can be calculated directly from the empirical data.
The second assumption of the model is related to a predetermined value of the
mortality rate at 110 years, the highest attainable age.

This assumption enables the computation of the second parameter $s$. Coale and
Kisker assume for $m_{110}$ an arbitrary value equal to 1.0 for men and 0.8
for women, both values taken from recent Swedish mortality.
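The way the two assumptions determine the model can be sketched in a few lines. The sketch below follows only the definitions given in the text, $k_x = \ln(m_x/m_{x-1})$ and $k_x = k_{85} - (x - 85) \cdot s$, from which $\ln m_{110} = \ln m_{85} + 25\,k_{85} - 325\,s$ follows by summation; the published procedure of Coale and Kisker may differ in detail. The input rates and the function name are hypothetical.

```python
import math


def coale_kisker(m84, m85, m110, last_age=110):
    """Extrapolate m_x for x = 85..last_age under k_x = k_85 - (x - 85) * s.

    k_85 is taken from the observed rates at ages 84 and 85; s is chosen so
    that the extrapolated rate at last_age equals the preset value m110.
    """
    k85 = math.log(m85 / m84)
    steps = last_age - 85                     # number of ages 86..last_age
    # ln m_last = ln m_85 + steps * k_85 - s * (1 + 2 + ... + steps)
    s = (math.log(m85) + steps * k85 - math.log(m110)) / (steps * (steps + 1) / 2)
    rates = {85: m85}
    for x in range(86, last_age + 1):
        k_x = k85 - (x - 85) * s
        rates[x] = rates[x - 1] * math.exp(k_x)
    return k85, s, rates


# Hypothetical male rates at ages 84 and 85; m_110 preset to 1.0 as in the text.
k85, s, rates = coale_kisker(m84=0.150, m85=0.165, m110=1.0)
print(k85, s)
for age in (85, 90, 95, 100, 105, 110):
    print(age, rates[age])
```

By construction the extrapolated schedule passes through the observed rate at age 85 and through the preset rate at age 110.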

As shown by Wilmoth (1995), the Coale-Kisker model implies an exponential-quadratic
function for mortality rates with respect to age:

$$m_x = e^{ax^2 + bx + c}$$
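The step from a linear rate of change to an exponential-quadratic age pattern can be made explicit. The following worked derivation is a sketch based only on the definitions $k_x = \ln(m_x/m_{x-1})$ and $k_x = k_{85} - (x - 85) \cdot s$ for ages above 85; it is not taken from Wilmoth (1995).

```latex
\begin{align*}
\ln m_x &= \ln m_{85} + \sum_{j=86}^{x} k_j
         = \ln m_{85} + \sum_{j=86}^{x} \bigl[k_{85} - (j - 85)\,s\bigr] \\
        &= \ln m_{85} + (x - 85)\,k_{85} - \frac{s}{2}\,(x - 85)(x - 84),
        \qquad x \ge 85 .
\end{align*}
```

The right-hand side is a quadratic polynomial in $x$ with leading coefficient $-s/2$, so that $m_x$ has the exponential-quadratic form with $a = -s/2$.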



The Himes-Preston-Condran (H-P-C) relational model for old-age mortality is
built up of two components: a set of standard age-specific mortality rates based
on empirical data from several countries up to age 95 and a model age
schedule extending thereafter up to 115 years. This model is specified
according to the linear increase with age in the logit transformation of age-specific
mortality rates:

$$\operatorname{logit}(m_x) = \alpha + \beta \cdot x$$

which is equivalent to the following formulation:

$$m_x = \frac{b\,e^{ax}}{1 + b\,e^{ax}}$$

where $b = e^{\alpha}$ and $a = \beta$.



A striking similarity is seen between the above function and the Heligman- 
Pollard model discussed earlier. The only difference is the dependent variable 
which is the death rate in the first model and the death probability in the 
second. The parameters of both these functions can be estimated using the 
empirical mortality rates for ages 80 years and over. 
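Estimation of the two parameters from rates at ages 80 and over can be sketched with a simple regression on the logit scale. This is an illustration under stated assumptions and not necessarily the estimation method used by the authors cited: ordinary least squares is applied to synthetic, noise-free rates generated from hypothetical values of $\alpha$ and $\beta$, and the function names are introduced for the example only.

```python
import math


def logit(p):
    """Logit transformation ln[p / (1 - p)], defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def fit_logit_line(ages, rates):
    """Least-squares fit of logit(m_x) = alpha + beta * x; returns (alpha, beta)."""
    n = len(ages)
    z = [logit(m) for m in rates]
    mean_x = sum(ages) / n
    mean_z = sum(z) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxz = sum((x - mean_x) * (zi - mean_z) for x, zi in zip(ages, z))
    beta = sxz / sxx
    alpha = mean_z - beta * mean_x
    return alpha, beta


def logistic_rate(x, alpha, beta):
    """m_x = b*exp(a*x) / (1 + b*exp(a*x)) with b = exp(alpha) and a = beta."""
    e = math.exp(alpha + beta * x)
    return e / (1.0 + e)


ages = list(range(80, 110))
# Synthetic rates from hypothetical parameter values alpha = -11, beta = 0.11.
rates = [logistic_rate(x, -11.0, 0.11) for x in ages]

alpha_hat, beta_hat = fit_logit_line(ages, rates)
print(alpha_hat, beta_hat)
```

The same routine applies to death probabilities $q_x$ in place of death rates, which is the only difference between the two logistic formulations discussed in the text.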



6.3 | Mortality at Age 80 to 109 Years in Four Countries

In this section, empirical mortality rates, by sex, are discussed for France, 
Italy, the Netherlands and Norway, in the years 1950-1994. The rates were 
estimated for single-year ages from 40 to 109 years (the highest age, 110+,
was excluded from detailed investigations) by dividing the annual numbers of 
deaths by respective mid-year population counts. Both variables come from the 
NIDI database. For three countries, France, the Netherlands and Norway, a 
complete series of deaths including all ages from 40 to 109 and all years from 
1950 to 1994 was available. For Italy, single-year deaths were missing below 
age 80 for many of the early years in 1950-1994. Five-year ages were 
available instead but these were not considered to be a good approximation of 
the age pattern of Italian mortality. This implied that Italy was excluded from 
some analyses. In some analyses the highest age was assumed to be 105 years 
instead of 109, which was related to the very small numbers of deaths 
observed at ages from 106 to 109. The population structure, in the form
provided by official sources (i.e. population registers, censuses and inter- 
censal estimation using births, deaths and migrations), was available by single 
ages for each of the four countries for the entire period 1950-1994. Note that 
the population numbers used in our study were only partly the same as those in 
official statistics. In fact, the population numbers used in our analyses 
consisted of two components. The first, comprising the ages from 40 to 79, 
was identical to the numbers given in the official statistics, and the second, for 
ages 80 and over, included population numbers obtained from the extinct 
generation method. In this method, the population at a particular age at the end 
of a year is estimated (cohort-wise) by subtracting the annual number of deaths 
at this age from the population at the beginning of the year. The impact of 
migration is not taken into account. The 'extinct generations' population was 
taken from the Odense Oldest Old Mortality Database in Denmark. 
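The logic of the extinct generation method can be sketched as follows. The sketch is a simplification: it assumes that all deaths recorded at age $a$ in year $t$ belong to the cohort aged $a$ at the beginning of that year, ignores migration as in the text, and uses made-up death counts. The function name and the data layout are hypothetical and do not reflect the structure of the databases mentioned above.

```python
def extinct_generation_population(deaths, age, year):
    """Population aged `age` at the beginning of `year`, reconstructed as the
    sum of all subsequent deaths in the same cohort.

    deaths[(a, t)] is the number of deaths at age a during year t. The cohort
    must be extinct within the period covered by `deaths`.
    """
    total = 0
    a, t = age, year
    while (a, t) in deaths:
        total += deaths[(a, t)]
        a, t = a + 1, t + 1      # follow the cohort: one year older, one year later
    return total


# Made-up death counts for one cohort, extinct after age 103.
deaths = {
    (100, 1990): 50,
    (101, 1991): 30,
    (102, 1992): 15,
    (103, 1993): 5,
}

start_100 = extinct_generation_population(deaths, 100, 1990)
start_101 = extinct_generation_population(deaths, 101, 1991)
print(start_100, start_101)

# The population at the end of 1990 equals the population at the beginning
# of 1990 minus the deaths during 1990.
print(start_100 - deaths[(100, 1990)] == start_101)
```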

The main principle of our investigation was that the random fluctuations 
present in the data had to be minimised. In order to do so, the data for four 
countries were pooled. Several pools were analysed. The broadest pool 
included all countries and all years (Figure 6.1). In the second pool, the annual
data from the years 1950-1994 were combined for each country separately 
(Figure 6.2a). Finally, to analyse mortality changes over time, an aggregate of 
the data for the four countries in subsequent decades (1950-59, 1960-69, ...) 
was investigated (Figure 6.2b). It is worth mentioning that the total number of 
deaths at ages 85 or over in the four countries taken together over the period 
1950 to 1994 was 2.7 million men and 5.1 million women. The number of 
those who died as centenarians (100 years or older) was over 12 thousand men 
and almost 44 thousand women. 

Figure 6.1 shows that mortality rates follow a regular pattern up to age 105,
except for a clear heaping at age 100. This was clearer for men than for 
women. The pattern above age 105 continues to rise, despite a drop at age 
106. The increase at ages 107 to 109 seems to be too pronounced, especially 
for men. One may therefore conclude that the less rapid increase and more 
regular pattern for women is due to the greater number of cases among 
women. 

Figure 6.1. Old-age mortality in four countries 1950-1994
(horizontal axis: Age)

Note: Pooled data from France, Italy, the Netherlands and Norway.

Apart from the overall pool of four countries and all calendar years, two more
pools are considered. These are shown in Figure 6.2. Mortality rates in these
pools show more variation than in the overall pool, which is likely to be a
random effect. In addition to the differences between the mortality patterns of
men and women, the fluctuations appear to be greater in smaller countries as
well as in decades in which the number of deaths is low. In our case, the
respective numbers of deaths for the decades studied were 0.8, 1.6, 3.2 and
5.3 thousand for men and 2.4, 5.3, 10.0 and 20.0 thousand for women.

Interestingly, as shown in Figure 6.2a, in the years 1950-1994 old-age 
mortality was highest in Italy. This was the case for both men and women. In 
Norway, on the other hand, mortality in old age was lowest, with the 
exception of ages beyond approximately 100 years, at which age mortality 
among French women was even lower than among women in Norway. 
Another interesting finding is that mortality in the Netherlands, particularly 
among men, was almost identical to that in Norway up to the age of 
approximately 95 years. At older ages, Dutch mortality is clearly higher than 
Norwegian mortality. 
Figure 6.2a. Old-age mortality by countries and decades, countries
(pooled data from 1950-1994). Panels: Men, Women; horizontal axis: Age.

Figure 6.2b. Old-age mortality by countries and decades, decades
(pooled data from four countries). Panels: Men, Women.

When we take a closer look at the decline in old-age mortality over time
(Figure 6.2b), we see that mortality has dropped at a surprisingly regular pace 
since the 1960s. This suggests that an even greater decline may be possible in 
the future. Having said that, given the current extremely low death rates after 
age 80, a further systematic decline seems to be unlikely. This implies that the 
rates of decline over time may decelerate. 

Figure 6.3 shows the exponential rate of change in mortality with age. The 
empirical rate for both men and women increases rapidly from the age of 
approximately 52 (women) and 60 (men) years up to the age of 75 years. 
Thereafter, sharp drops are seen, especially for women, until the age of 104 
and 101 years, respectively. This regular pattern does not include the 
irregularities in the age interval 96-99 (women) and 94-99 (men). For extreme 
ages, i.e. above 105 years, large increases are observed for both sexes from 
one age to the next. Due to the small study population, however, this pattern
cannot be considered conclusive or universal.

Figure 6.3. Exponential rate of change of mortality with age
(three countries, 1950-1994)

The increase in the exponential rate of the age-related mortality change for the
younger old (below age 75) and the deceleration of this rate for the older old
(over the age of 75 years) are important findings and must have implications in
terms of modelling of age patterns. Both have also been documented in earlier
publications, mainly in the work of Horiuchi and Coale (1990) who, using the 
same concept, investigated mortality data from several developed countries. 
The empirical data presented in our chapter (strongly) confirm this property 
for women and (less strongly) for men. In the literature, the rate of change at 
ages higher than 100 years has been documented only for a few countries (e.g. 
Wilmoth, 1995), and was assumed to decline further with age. Our data on 
mortality at ages above 105 from four countries do not confirm this pattern, 
but this is probably related to the size of our study population. 
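
The exponential rate of change of mortality with age can be illustrated with a short sketch. It assumes that the rate at age x is approximated by the difference of log death rates at consecutive single-year ages, k(x) = ln m(x+1) - ln m(x). This discrete form, the function name and the Gompertz-like test rates are illustrative assumptions, not the calculation behind Figure 6.3.

```python
import math


def age_rate_of_change(rates):
    """Return k(x) = ln m(x+1) - ln m(x) for every age that has a successor.

    `rates` maps single-year age -> central death rate (all rates > 0).
    """
    return {
        age: math.log(rates[age + 1]) - math.log(rates[age])
        for age in sorted(rates)
        if age + 1 in rates
    }


# Hypothetical Gompertz-like schedule: m(x) = 0.05 * exp(0.1 * (x - 80)).
rates = {age: 0.05 * math.exp(0.1 * (age - 80)) for age in range(80, 91)}
k = age_rate_of_change(rates)
```

For a pure Gompertz schedule k(x) is constant, so an empirical k(x) that first rises and then falls with age, as described above, is direct evidence against a single exponential curve.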



6.4 | Fitting Models to Data for Ages 80-109

Old-age mortality data which include ages above 100 are seldom studied. 
Thatcher et al. (1994) analysed a female cohort of pooled data from 11 
countries to verify consistency of the empirical age-specific mortality pattern 
with several well-known models (Gompertz, Weibull, Heligman-Pollard and 
Perks, exponential-quadratic and 2-parameter logistic). Their results show that 
the first three models deviate strongly from the empirical data. Wilmoth 
(1995) used data from Sweden and Japan to study the validity of the 
exponential-quadratic function employed in the Coale-Kisker relational model 
and found that this function "provides an excellent approximation to reality". 

The 11 functions taken into consideration in this chapter are listed in Table
6.1. All but one are formulated in terms of the force of mortality $\mu_x$. For the
sake of simplicity of the calculations, the force of mortality has, however,
been replaced in our study by the central death rate $m_x$. One function
(Heligman-Pollard) is specified for the probability of dying $q_x$.¹ The functions
differ from each other with respect to how they describe mortality changes
with age. Some assume a rapid exponential increase (Gompertz and Weibull),
whilst others only assume an asymptotic increase (logistic). The exponential-
quadratic is a well-known bell-shaped function, while polynomials may
increase or decrease infinitely, depending on the number of parameters. These
properties are not decisive for the possible usefulness of these functions, which
is related to the fact that the functions are only employed up to the age at
which survivors are observed.

¹ To ensure that this function is comparable with those defined for mortality rates, the
model-implied values of death probabilities from the HP model were converted into
rates using the transformation $m(x) = q(x) / (1 - (1 - f(x))\,q(x))$. The values of f(x)
were calculated from data up to the age of 95 years. For ages from 95 onwards,
linear interpolation was applied between the following arbitrarily assumed values for
ages 100, 105, 110, 115, 120, 125 and 130 years: 0.450, 0.420, 0.385, 0.345, 0.300,
0.250 and 0.200, respectively.

Table 6.1. Mortality models employed in the survey

Type                    Number of    Designed or used by      Formula                                      Abbreviation
                        parameters
exponential             2            Gompertz                 $\mu_x = b e^{ax}$                           GOMP2
shifted exponential     3            Makeham                  $\mu_x = c + b e^{ax}$                       MAKE3
exponential-quadratic   3            Coale-Kisker             $\mu_x = e^{a + bx + cx^2}$                  EXQU3
logistic 2-param        2            Himes-Preston-Condran    $\mu_x = b e^{ax} / (1 + b e^{ax})$          LOGI2
logistic 3-param        3            Beard                    $\mu_x = b e^{ax} / (1 + c e^{ax})$          LOGI3
logistic 4-param        4            Perks                    $\mu_x = (d + b e^{ax}) / (1 + c e^{ax})$    LOGI4
logistic 2-param        2            Heligman-Pollard         $q_x = b e^{ax} / (1 + b e^{ax})$            HELI2
power                   2            Weibull                  $\mu_x = b x^{a}$                            WEIB2
shifted power           3            -                        $\mu_x = c + b x^{a}$                        WEIB3
polynomial 3-param      3            -                        $\mu_x = a + bx + cx^2$                      POLY3
polynomial 4-param      4            -                        $\mu_x = a + bx + cx^2 + dx^3$               POLY4

In order to evaluate the usefulness of a particular function, we investigated 
how it fits the empirical data rather than considering theoretical arguments. 
We applied the eleven functions to the same set of data and indicated the best 
and worst functions on the basis of certain criteria. 
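
To make the qualitative differences between the candidate functions concrete, the sketch below writes a few of them as plain Python callables and includes the probability-to-rate conversion described for the Heligman-Pollard model. The parameter names (a, b, c, d), the way they enter each formula and all numerical values are assumptions chosen for illustration; they are not estimates from this chapter.

```python
import math


def gompertz(x, a, b):
    """Exponential: b * exp(a*x); grows without bound."""
    return b * math.exp(a * x)


def coale_kisker(x, a, b, c):
    """Exponential-quadratic: exp(a + b*x + c*x^2); bell-shaped when c < 0."""
    return math.exp(a + b * x + c * x * x)


def logistic(x, a, b, c, d=0.0):
    """Logistic family: (d + b*exp(a*x)) / (1 + c*exp(a*x)).

    d = 0 gives the 3-parameter form; d = 0 and c = b gives the 2-parameter form.
    """
    e = math.exp(a * x)
    return (d + b * e) / (1.0 + c * e)


def weibull(x, a, b):
    """Power function: b * x**a."""
    return b * x ** a


def q_to_m(q, f):
    """Convert a death probability q into a central rate m = q / (1 - (1 - f) q)."""
    return q / (1.0 - (1.0 - f) * q)


ages = range(80, 131)
gompertz_rates = [gompertz(x, 0.1, 1e-5) for x in ages]
logistic_rates = [logistic(x, 0.1, 1e-5, 1e-5) for x in ages]  # 2-parameter form

# With c < 0 the exponential-quadratic curve peaks at age -b / (2c).
ck_a, ck_b, ck_c = -20.0, 0.3, -0.0012
ck_peak_age = -ck_b / (2.0 * ck_c)
```

With these hypothetical parameters the Gompertz rate exceeds 1 well before age 130 while the 2-parameter logistic stays below its ceiling of 1, which is the contrast between "exponential" and "asymptotic" increase drawn in the text.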
The maximum likelihood (ML) method, given the Poisson-distributed errors,
is the optimal estimation approach to the model age schedules of mortality 
analysed here (McCullagh and Nelder, 1989). However, the weighted least 
squares (WLS) method, with weights taken as reciprocals of the variance of 
the error term, can be considered to be approximately equivalent to the ML 
estimation (e.g. Hoem, 1972; also: Van Imhoff and Tabeau, 1998). If the 
weights are inversely proportional to the number of deaths at subsequent ages, 
they can be seen as representing the reciprocals of the variance of the mortality 
rate modelled (e.g. Van Imhoff, 1991). The WLS method, with the weights 
based on the (observed) number of deaths, was applied in our study. The WLS 
method was selected due to its simplicity and because no special software was 
needed. For the purpose of comparing the functions tested here, the choice of 
the ML method was not crucial.

The functions were fitted to mortality data from four countries (overall pool)
using the weighted least squares method, with the loss function expressed in
terms of numbers of deaths, employing two methods of weights. Method 1
uses reciprocals of squared deaths as weights:

$$\sum_x \frac{(d_x - m_x p_x)^2}{d_x^2} \quad \text{or equivalently} \quad \sum_x \left( \frac{M_x - m_x}{M_x} \right)^2$$

where $d_x$ denotes deaths, $p_x$ the mid-year population, $M_x = d_x / p_x$ the empirical
mortality rate, and $m_x$ the model-implied mortality rate from a given function.
Method 1 treats all observations equally, as if they were non-random, and
minimises relative deviations. In order to account for random fluctuations,
which are substantial at older ages, method 2 is used. The weights in method 2
are inversely proportional to deaths, which are approximately equal to the variance
in the binomial distribution of deaths:

$$\sum_x \frac{(d_x - m_x p_x)^2}{d_x}$$

This loss function favours the observations for younger ages and puts less
emphasis on observations at older ages where the deviations in deaths tend to
be higher, unless a function fits both groups of ages equally well. 
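
The two weighting schemes can be compared on a toy example. The sketch below assumes the loss functions as described in the text (squared deviations in deaths, divided by squared deaths in method 1 and by deaths in method 2); the function names and the four-age data set are hypothetical.

```python
def loss_method1(deaths, population, model_rates):
    """Method 1: squared deviations in deaths weighted by 1 / deaths^2."""
    return sum(((d - m * p) / d) ** 2
               for d, p, m in zip(deaths, population, model_rates))


def loss_method1_relative(deaths, population, model_rates):
    """Equivalent form of method 1: squared relative deviations of the rates."""
    return sum((((d / p) - m) / (d / p)) ** 2
               for d, p, m in zip(deaths, population, model_rates))


def loss_method2(deaths, population, model_rates):
    """Method 2: squared deviations in deaths weighted by 1 / deaths."""
    return sum((d - m * p) ** 2 / d
               for d, p, m in zip(deaths, population, model_rates))


# Hypothetical data for four ages, from many deaths to very few.
deaths = [5000.0, 2000.0, 100.0, 4.0]
population = [50000.0, 8000.0, 200.0, 5.0]

model_a = [0.100, 0.25, 0.5, 0.6]  # exact at younger ages, 25% off at the oldest age
model_b = [0.105, 0.25, 0.5, 0.8]  # 5% off at the youngest age, exact elsewhere

m1_a = loss_method1(deaths, population, model_a)
m1_b = loss_method1(deaths, population, model_b)
m2_a = loss_method2(deaths, population, model_a)
m2_b = loss_method2(deaths, population, model_b)
```

In this example method 1 prefers model B, because a 25 per cent miss on four deaths counts as much as it would on five thousand, whereas method 2 prefers model A, which is the sense in which method 2 favours the younger, better-populated ages.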

The given empirical pattern of mortality (i.e. the overall pool for all countries
and all years) is, of course, not necessarily universal. Nonetheless, some
conclusions on the performance of the different models when applied to this
pattern are worth mentioning (see Table 6.2). Firstly and most obviously, the
two estimation criteria provide a slightly different rank order of models, which 
can be easily seen from the sum of squared residuals (SSR), although generally 
the models with a greater number of parameters fit the data better. 
Interestingly, the gain in the goodness of fit from using the 3-parameter instead 
of 2-parameter models is much higher than the gain from using the 4- 
parameter models instead of 3-parameter representations. The goodness of fit 
in the 3- and 4-parameter models is quite similar, with a slight preference for 
the latter. Taking into account the principle of parsimony of model 
formulation, the group of 3-parameter functions can be recommended as a
reasonable choice for modelling mortality of the oldest old. To avoid serious 
mistakes, the worst models must be mentioned. These are all 2-parameter 
functions, except perhaps for the H-P-C logistic function, which for men does 
not perform identically in the two estimation methods. The SSR of the
remaining models does not vary dramatically, which makes any
comparison difficult. For men, this is certainly due to the very high
mortality rates above age 105, which produce large deviations. For women,
the death rates increase more regularly with age and some systematic patterns
can be seen in the different SSRs. However, some of the remaining functions fit
the (mainly female) data better than others. These are the two polynomials, the 
Perks' and Beard's logistic functions and the exponential-quadratic (C-K) 
model. 

Considering only the SSR does not, however, seem to be sufficient to evaluate 
the usefulness of each particular model. Therefore, in addition to the SSR, two 
more criteria were applied: 

• a relative deviation between the empirical and model-implied deaths, and 

• a relative deviation between the empirical and model-implied life 
expectancy.

These two indexes were chosen mainly for practical reasons, namely the fact 
that they are commonly used in statistical practice of mortality and population 
projections. Moreover, an acceptable percentage deviation can be easily 
defined for relative measures. It is important, however, to stress that the above 
criteria are secondary rather than primary because a total deviation close to 
zero does not necessarily mean a close fit of a function to data, but could also 
be a compensation effect of equally large negative and positive deviations. 
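
The compensation effect mentioned above is easy to reproduce. The sketch below assumes a simple life-table calculation (rates converted to probabilities with a constant fraction f of the year lived by those dying, and life expectancy truncated at the last age supplied); these simplifications, the function names and the two-age data are illustrative assumptions only.

```python
def relative_deviation(observed, implied):
    """Signed relative deviation of a model-implied quantity from the observed one."""
    return (implied - observed) / observed


def life_expectancy(rates, f=0.5):
    """Life expectancy at the first age of `rates`, truncated after the last age.

    Each rate m is converted to a probability q = m / (1 + (1 - f) m).
    """
    survivors, person_years = 1.0, 0.0
    for m in rates:
        q = min(1.0, m / (1.0 + (1.0 - f) * m))
        dying = survivors * q
        person_years += survivors - (1.0 - f) * dying
        survivors -= dying
    return person_years


# Hypothetical data: the model over-predicts deaths at the first age
# and under-predicts them by the same amount at the second.
deaths = [100.0, 100.0]
population = [1000.0, 500.0]
model_rates = [0.11, 0.18]

implied = [m * p for m, p in zip(model_rates, population)]
total_deviation = relative_deviation(sum(deaths), sum(implied))
loss_method2 = sum((d - i) ** 2 / d for d, i in zip(deaths, implied))

observed_rates = [d / p for d, p in zip(deaths, population)]
e_deviation = relative_deviation(life_expectancy(observed_rates),
                                 life_expectancy(model_rates))
```

Here the relative deviation in total deaths is essentially zero although the weighted sum of squared residuals is clearly positive, which is why the relative measures are treated as secondary to the SSR.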
Table 6.2 shows that for the five good-fit functions (two polynomials, Perks'
and Beard's logistic functions, and the exponential-quadratic model) the 
relative deviations are almost all below one per cent, whereas for the three 
functions with the worst fit, the deviations amount to four to eight per cent. 
This is a clear confirmation of the results based on the values of the SSRs. 

There is less consistency when all three criteria (the SSR and the two relative
deviations) are viewed within the framework of the two estimation methods. 
Sometimes a function fits data well with one method of weights, but not with 
the other. As we have no intention of selecting any of the criteria, we have 
assumed that each set of weights should produce the same result. Thus, a 
divergence of results obtained from the two methods was seen as evidence of 
inappropriateness of a function to the given data set. As a measure of the
divergence, we used an index summed over ages x (see Table 6.2).

The table shows that three functions are characterised by an extremely small 
index of divergence for both sexes, namely the two polynomials and the 
exponential-quadratic model. For women, the 2-parameter logistic (H-P-C) 
function behaves almost equally well, though the fit in terms of the SSR is not 
as good as for the three other functions. The reasons for the different 
performance of the investigated functions can be more precisely explained on 
the basis of the specificity of the two estimation methods, i.e. when the fit is 
analysed at each particular age. The highly consistent C-K function and rather 
inconsistent H-P-C model are good examples (see Figure 6.4). 

The formulation of the C-K model is flexible enough to approach the age 
schedule at any age and in any estimation approach quite well. The deviations 
between the observed and model-implied values of death rates are always low 
in this model. This is not the case with the H-P-C function. This model 
ignores, by definition, the high mortality rates in the tail of the age curve. This 
implies that the SSR in estimation method 1 is higher than in comparative 2- 
parameter models, in particular in the C-K model. This feature is even more 
dominant when estimation method 2 is applied, as this method favours (by 
definition) lower ages. Method 2 ensures, however, that up to the age of 99 
years the fit is almost perfect. In effect, a rapid improvement is seen in the fit 
of this particular model when method 2 is used. The divergence index for the 
H-P-C model is high. 




Table 6.2. Goodness of fit of 11 models for old-age mortality in four countries, 1950-1994.
Fit interval 80-109 years

Figure 6.4. Three functions employed in modelling old-age mortality.
Pooled data from four countries, 1950-1994.
Fit interval 80-109 years
Panel titles: Heligman-Pollard model; Coale-Kisker model; Himes-Preston-Condran model.
Column titles: Weights method 1; Weights method 2.

Mortality data for each country separately make it virtually impossible to fit
the models up to age 109 (see Figure 6.2). The calculations have therefore 
been made for mortality at ages from 80 to 104 years. The results given in 
Table 6.3 show that the conclusions formulated earlier about the best functions 
are also valid for each of the four countries separately. 

In order to determine which models would be the best choice for each of the 
four countries, we decided to focus on the group of 3-parameter models 
(which are flexible enough to fit old-age mortality data in any country) and on 
estimation method 2 (which is more robust than method 1). For France, the 
second-degree polynomial (POLY3) can be recommended for both men and 
women. For Italy, Beard's model is most suitable for men and Coale and 
Kisker's for women. The C-K model is also the best choice for Dutch men and 
women. For men in Norway, Beard's function and the second-degree 
polynomial are equally good. Finally, for Norwegian women the C-K model is 
recommended. 



6.5 | Extrapolation of the Age Pattern seen from the Perspective of 14 Models

A complete age pattern of mortality of the oldest old cannot always be 
identified empirically. This is related to the fact that data for this population 
group tend to be more deficient than for other age groups. Mortality of the 
very old can, however, be approximated reasonably well without using the 
poor-quality empirical data. A simple procedure for such an approximation is 
the extrapolation of old-age mortality (usually at ages 85 and older) from high- 
quality empirical indicators of mortality in the younger population groups. In 
the official statistics of many (mainly less-developed) countries, the ultimate 
age of reliable data is 85 years. Higher ages are often ignored by showing life 
table and/or other mortality indexes for an aggregated age 85+ as the last age
interval. This creates a necessity to estimate the missing or unreliable mortality
indexes for the following 15 to 25 years of age. In this section, analyses
are presented in which several functions are tested from the point of view of 
their performance in the age extrapolation of mortality. The number of 
functions is extended up to 14 by including higher-degree polynomials, i.e. 
polynomials with five, six and seven parameters. The empirical mortality rates 
used to estimate these functions cover ages from 60 to 84 years. Three
countries are considered: France, the Netherlands and Norway. Italy has been
excluded because of data problems.

Table 6.3a. Sum of squared residuals in 11 models for old-age mortality, by countries, 1950-94.
Fit interval 80-104 years

Table 6.3b. Sum of squared residuals in 11 models for old-age mortality, by countries, 1950-94.
Fit interval 80-104 years

Table 6.4a. Fit and extrapolation goodness of 14 models for old-age mortality, three countries, 1950-1994.
Fit interval 60-84 years

Table 6.4b. Fit and extrapolation goodness of 14 models for old-age mortality, three countries, 1950-1994.

Both previously described methods of weights are used to fit the 14 functions. 
However, the goodness of fit as measured by the SSR over the fit interval (i.e. 
ages 60-84) is not of interest here. A close fit up to the age of 84 years can be 
associated with poor extrapolation above this age. In order to evaluate the 
functions in terms of their goodness of extrapolation, three other criteria are
used (see Table 6.4), namely relative deviations from empirical (thus, 'true') deaths,
('true') life expectancy and extrapolation divergence. The definitions of the
three criteria for the goodness of extrapolation are the same as those in the 
preceding section, except that the calculation is made over the extrapolation 
interval rather than the fitting interval. 
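
The extrapolation exercise can be sketched as follows. To stay free of any optimiser, the sketch swaps in a weighted log-linear regression of ln m(x) on age (weights equal to deaths) for the chapter's weighted least squares on the deaths scale, and it fits only a Gompertz curve. The 'true' schedule is a hypothetical logistic curve, and the exposures, function names and parameter values are likewise assumptions.

```python
import math


def fit_gompertz_loglinear(ages, rates, weights):
    """Weighted regression of ln m(x) on x; returns (a, b) with m(x) = b*exp(a*x)."""
    total = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, ages)) / total
    logs = [math.log(m) for m in rates]
    y_bar = sum(w * y for w, y in zip(weights, logs)) / total
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(weights, ages, logs))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, ages))
    a = sxy / sxx
    return a, math.exp(y_bar - a * x_bar)


def life_expectancy(rates, f=0.5):
    """Truncated life expectancy at the first age of `rates` (see assumptions above)."""
    survivors, person_years = 1.0, 0.0
    for m in rates:
        q = min(1.0, m / (1.0 + (1.0 - f) * m))
        dying = survivors * q
        person_years += survivors - (1.0 - f) * dying
        survivors -= dying
    return person_years


def true_rate(x):
    """Hypothetical 'true' schedule that levels off at high ages (logistic)."""
    e = 2e-5 * math.exp(0.11 * x)
    return e / (1.0 + e)


fit_ages = list(range(60, 85))             # fitting interval 60-84
extrapolation_ages = list(range(85, 110))  # extrapolation interval 85-109
exposure = {x: 100000.0 * 0.9 ** (x - 60) for x in range(60, 110)}

fit_rates = [true_rate(x) for x in fit_ages]
fit_deaths = [m * exposure[x] for m, x in zip(fit_rates, fit_ages)]
a, b = fit_gompertz_loglinear(fit_ages, fit_rates, fit_deaths)

true_extra = [true_rate(x) for x in extrapolation_ages]
model_extra = [b * math.exp(a * x) for x in extrapolation_ages]

true_deaths = sum(m * exposure[x] for m, x in zip(true_extra, extrapolation_ages))
model_deaths = sum(m * exposure[x] for m, x in zip(model_extra, extrapolation_ages))
rel_dev_deaths = (model_deaths - true_deaths) / true_deaths

e85_true = life_expectancy(true_extra)
e85_model = life_expectancy(model_extra)
rel_dev_e85 = (e85_model - e85_true) / e85_true
```

Because the hypothetical schedule decelerates after the fitting interval, the extrapolated Gompertz curve overshoots it: implied deaths are too high and life expectancy at 85 too low. This illustrates why the criteria are evaluated over the extrapolation interval rather than the fitting interval.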

The extrapolated life expectancy at age 85 is perhaps the most interesting
outcome, especially since the age group 85 years and over usually closes abridged
life tables. In terms of life expectancy at age 85 the Gompertz function, the
usefulness of which has recently been questioned, gives quite satisfactory (for
women) or even excellent (for men) results. Unfortunately, these results
cannot be considered valid because they are obtained with a very poor fit above
age 85 (see Figure 6.5). This is shown by the index of extrapolation
divergence, calculated as a weighted SSR over the extrapolation interval (see
Table 6.4; weights method 2). Though the value of this index depends on the
weighting method used, in most cases both methods produce similar results.

Table 6.4 shows that the best approximation of true life expectancy, together
with a very close fit over the extrapolation interval, may be obtained for both
sexes using polynomials with five or six parameters. It is worth noting that
these functions also show an excellent fit over the age interval 60-84. Almost
equally good results, but only for men, are obtained with the Weibull 3- 
parameter function. For women (and only women), two other functions might 
be suggested as good approximations of true deaths and life expectancy: 
Heligman-Pollard and Perks. However, they extrapolate satisfactorily only up 
to the age of 95 years.

Figure 6.5. Extrapolation of mortality beyond age 85 resulting from selected models.
Pooled data from three countries, 1950-1994.

The above considerations are valid provided that the empirical age pattern of
mortality is similar to that under study. For another pattern, the functions that 
yield good results might be different. It seems that a more universal conclusion 
can be formulated about functions that do not produce good extrapolations. 
Here, the 3-parameter polynomial and the 2-parameter Weibull function must 
be mentioned, both of which have the worst fit below as well as above age 85 
and the largest deviations from 'true' life expectancy. Almost equally 
inconsistent over the extrapolation interval is the exponential-quadratic C-K 
function, though it fits very well in the age interval 60-84. 



6.6 | Discussion

Most discussions about old-age mortality have focused on its trends, levels and 
age patterns. Two aspects are most often discussed: age patterns and trends in 
mortality at selected ages over time. It is a fact that age-specific death rates 
have declined over time, even at ages as high as 107-109 years, the highest 
ages at which mortality can be measured empirically at present (Kannisto, 
1995). However, there is little agreement among researchers with respect to 
the age patterns of old-age mortality. Some researchers suggest that mortality 
increases exponentially with age, following the pattern of a Gompertz curve. 
Others state that human death rates increase exponentially with age from 
approximately 30-35 years to 80-85 years (e.g. Horiuchi and Coale, 1990). At 
more advanced ages, the rate of increase slows down and mortality no longer 
reflects the pattern given in the Gompertz curve. This phenomenon is termed a 
levelling off of mortality and sometimes takes on the form of a plateau-like 
flattening in the tail of empirical age curves (see the last references for 
examples). A flattening of this sort is not always clearly visible, even when the 
exponential curve levels off. This can, however, be shown analytically using 
theoretical models of mortality. When fitting theoretical models to empirical 
age-specific death rates, such as the Gompertz-Makeham and the Gamma-
Makeham models, the fit of the former is considerably poorer than that of the
latter (Yashin et al., 1993). The phenomenon of levelling off is considered to 
be a statistical artefact by some researchers, which may be related to 
inadequate measurements of mortality at older ages. Since mortality at older 
ages, compared with younger ages, increases more rapidly with age, 
measuring old-age mortality in one-year time intervals may be inappropriate. It 
has therefore been suggested that an optimal length of such an interval be 
defined statistically. Finally, some researchers interpret the levelling off as
evidence supporting the paradigm of the unlimited life span of humans in the
long term. 

The aim of our study was to present a comprehensive overview of model age 
schedules of mortality at extreme ages. The motivation of this chapter was, 
however, more practical than theoretical. Neither an improved measurement 
of mortality of the extreme old nor questions about the viability/non-viability
of the levelling off of mortality have been addressed here. We simply focused 
on some recently proposed models expected to appropriately fit the death rates 
at ages after 85 and investigated which perform better than others, and why. 
Thus, our modelling can be said to be data-driven. An analysis of old-age 
patterns of mortality can, however, be approached in various ways. Other 
authors have presented theory-driven models, as for instance those discussed
by Yashin in Chapter 11 of this volume. Another attempt, made recently by 
Alho and Nyblom (1997), employed the concept of mixed estimation (Theil 
and Goldberger, 1961) to study the 1980-93 old-age mortality in Finland. In 
the mixed estimation method, as in the empirical Bayes approach, empirical 
data and auxiliary information are used to estimate statistical models. The 
auxiliary information consists of 'extra' observations on mortality obtained 
either from explanatory models or from existing (international or national) 
high-quality data sources. The auxiliary information is often available as a 
single data point (e.g. the death rate at age 110 years) that serves as a target 
value for a particular age. The Alho and Nyblom method of incorporating 
extra information is worth mentioning as a further extension of data-driven 
studies such as ours. 

Alho and Nyblom showed that the so-called generalised mixed estimator 
(GME) can be (approximately) expressed as a weighted average of the 
empirical data estimator and the auxiliary estimator. They extended the GME 
to the maximum likelihood estimation of the parameters of the logit and 
Poisson distributions that are directly applicable in the analysis of old-age 
mortality when the generalised linear models are considered.

Using the Kannisto collection of international data on mortality at old age, 
Alho and Nyblom specified a (sex-specific) target for mortality at age 110. 
The Kannisto target appeared to be different (in particular for men) from that 
proposed originally by Coale and Kisker (1989). Both these targets were used 
in the mixed estimation of Finnish mortality in 1980-93. For females, whose
targets were quite similar, the three estimated curves were very similar (the
curve from the non-mixed estimation versus the two curves from the mixed
target estimation). For men, the targets were quite different, resulting in two
distinctive curves obtained in the mixed estimation. Based on the fit, Coale and
Kisker's target was suggested to be more credible for men in Finland. In this
model the fit was similar to that of the non-mixed estimation. Moreover, in the
mixed estimation, all the coefficients considerably stabilised. Finally, the use 
of targets ensured that the male-female crossover, seen in Finnish mortality 
after the usual non-mixed estimation, was eliminated. 
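
The weighted-average reading of mixed estimation can be illustrated in the simplest possible setting. The sketch below is a scalar analogue only: it combines one data-based estimate with one target value using inverse-variance weights. Alho and Nyblom's estimator operates on the coefficients of generalised linear models, so the weighting rule, the function name and all numbers here are assumptions made for illustration.

```python
import math


def mixed_estimate(data_estimate, data_variance, target, target_variance):
    """Inverse-variance weighted average of a data estimate and an auxiliary target.

    Returns the combined value and the weight given to the data estimate.
    """
    data_precision = 1.0 / data_variance
    target_precision = 1.0 / target_variance
    weight = data_precision / (data_precision + target_precision)
    return weight * data_estimate + (1.0 - weight) * target, weight


# Hypothetical log death rate at age 110: a noisy estimate from sparse data
# and a more tightly specified external target.
log_rate_data, var_data = math.log(1.2), 0.25
log_rate_target, var_target = math.log(0.8), 0.05

combined, weight_on_data = mixed_estimate(log_rate_data, var_data,
                                          log_rate_target, var_target)
combined_rate = math.exp(combined)
```

The noisier the data at extreme ages, the more the combined value is pulled towards the target, which is consistent with the stabilising effect of targets reported above.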

The focus of our chapter was on the performance of various models when 
applied to high-quality data for the very old. Two approaches to old age 
patterns of mortality were discussed. In the first approach, we aimed to 
determine the best function to fit the mortality pattern at ages above 80, with 
particular emphasis on ages around and above 100 years. In the second 
approach we applied a theoretical age schedule of mortality for ages above 85 
years, having estimated its parameters using empirical data for younger ages 
only (60-84 years). In conclusion, two issues deserve special attention when 
attempting to fit a mathematical function to the oldest-old mortality data: 

• Compared with more complex models, functions with two parameters, such 
as those of Gompertz, Weibull, Himes-Preston-Condran and Heligman- 
Pollard, performed poorly in both approaches. Generally, the wider the
age range studied, the more parameters are needed. However, in the future,
age patterns of mortality of the oldest old may become more regular even at 
extreme ages. This in turn could imply that parsimonious 2-parameter 
formulations could be successfully used. 

• In modelling old-age mortality, it is vital that high-quality empirical data for 
ages as high as at least 85 years are used. It is especially dangerous to use 
data series truncated at age 75.

The above conclusions are based on properties of the empirical age curve of 
mortality that are common to most of the populations. Mortality as a function 
of age increases at different rates in particular age intervals. The most 
important feature is an acceleration of the age-related increase in mortality at 
ages approaching 75 years and a deceleration thereafter. Thus, methods that 
extrapolate mortality using data below age 75 are doomed to fail. 

More detailed conclusions regarding the best possible method of modelling 
mortality age patterns of the oldest old differ in each of the two approaches 
considered. It is important to note that the best models recommended are
always consistent with the empirical rate of change in mortality with age (see
Figure 6.3): 

• In the case of reliable mortality data for ages above 85 years, the best 
choice appears to be the Coale-Kisker exponential-quadratic model. Four 
other functions (POLY3, POLY4, Perks and Beard) are definitely worth 
mentioning. The exponential-quadratic function proved to be one of the best 
models for the age interval 80-109. An additional advantage of this method 
is the simplicity of the original calculations, as designed by the authors, 
where one of the two parameters can be calculated directly from data and 
the second is obtained using the assumption of a fixed limit value for age 
110 years. These practical options make the method relatively safe. 

• When mortality data above age 85 are not reliable and satisfactory data 
exist for ages 60-84 years, then the best choice is extrapolation of the age 
pattern using a polynomial with five or six parameters and, importantly, 
fitting it using the weighted least squares method. 
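
The simplified calculation attributed above to the authors of the exponential-quadratic model can be sketched as follows. The sketch assumes the procedure as it is commonly described in the literature: the rate of increase at age 85 is taken from the data (here from the rates at ages 77 and 84), it then changes linearly with age, and the slope is chosen so that the rate at age 110 equals a fixed target. The ages used, the function name and the input values are assumptions to be checked against Coale and Kisker (1989).

```python
import math


def coale_kisker_extend(m77, m84, m110_target):
    """Extend death rates from age 85 to 110 with a linearly changing k(x).

    k(x) = ln m(x) - ln m(x-1) is set to k85 + s * (x - 85) for x = 85..110,
    with k85 taken from data and s solved from the target rate at age 110.
    """
    k85 = math.log(m84 / m77) / 7.0
    # Sum of k(x) over 85..110 is 26*k85 + 325*s, which must equal ln(m110/m84).
    s = -(math.log(m84 / m110_target) + 26.0 * k85) / 325.0
    rates, log_m = {}, math.log(m84)
    for age in range(85, 111):
        log_m += k85 + s * (age - 85)
        rates[age] = math.exp(log_m)
    return rates


# Hypothetical inputs: observed rates at ages 77 and 84 and a target of 1.0 at age 110.
extended = coale_kisker_extend(m77=0.06, m84=0.12, m110_target=1.0)
```

Only one quantity is estimated from data and one follows from the target, so the resulting schedule always reaches the assumed value at age 110 by construction.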
Abstract

This chapter presents forecasts up to 2020 of overall mortality by period and 
cohort, and of all-cause mortality obtained by aggregation from cause-specific 
models (by period) for four developed European countries (France, Italy, the 
Netherlands and Norway). Dynamic parameterisation models and trend 
extrapolation are applied to obtain the forecasts. The results show that the 
three approaches produce different levels of mortality, and consequently of life 
expectancy in the future. For women, the all-cause approach tends to result in 
higher life expectancies than the cause-specific approach. For men in France, 
Italy and the Netherlands, the all-cause approach generally resulted in lower 
life expectancies by 2020 than cause-specific forecasts, again noted for ages 
40, 60 and 80. For Norwegian men, the overall mortality approach produced 
higher life expectancies than cause-specific forecasts. Forecasts of mortality by 
cohort are the lowest (in terms of life expectancy) of the three. When 
comparing the three approaches, criteria such as forecast accuracy, 
transparency, utility and options for validation should be taken into account. It 
is suggested, however, that all three approaches are useful and needed. So, 
choosing among the various approaches is needed only if particular constraints 
prevent researchers from performing all three types of forecasts.

7.1 | Introduction

The major objective of our study is to give indications about the usefulness of 
alternative methodologies for mortality forecasting, in particular those which 
take advantage of cohort and cause-of-death dimensions. An international 
perspective of four European countries (France, Italy, the Netherlands and 
Norway) is selected as the geographic frame of analysis. The time dimensions
chosen are the years 1950-1994 and cohorts born in the 19th and 20th centuries.
Our forecasts are designed as an experiment in which three alternative 
forecasts are made to evaluate the impact of different background factors on 
future trends in mortality. Our research focused on producing forecasts based 
on (a) best statistical models (least squares criterion), (b) extrapolation of 
trends, and (c) general expectations about future trends. Three approaches 
were tested: overall mortality over time, overall mortality by birth cohorts, 
cause of death-specific mortality over time. 

A central focus of the study was to distinguish patterns in the three forecast 
outcomes. Differences were anticipated and found. This prompted a second 
question: if one had to choose between the different approaches, which criteria 
would one have to apply, and how? 

In Section 7.2 we first briefly sketch the scientific framework of the study. 
Sections 7.3 and 7.4 discuss data sources and the statistical method applied. 
Section 7.5 is devoted to problems related to fitting the models to the observed 
data and problems of forecasting. The results are shown in Section 7.6, and 
Section 7.7 contains a summary and discussion. 



7.2 | The Effects of Age, Period, Cohort and Cause of Death in Forecasting Mortality

Age, period, cohort, and cause of death are basic dimensions used in 
prediction of mortality to improve the understanding of mortality change. The 
relative importance of these basic dimensions, alongside forecasting accuracy,
plays a predominant role in the selection of prediction approaches. In this
section we discuss the meaning of the effects of age, period, cohort, and cause 
of death in forecasting mortality. Further on, we review recent knowledge 
about patterns in mortality forecasts obtained from different approaches and 
about possible reasons underlying the preference for alternative forecasts. 

Mortality, like the other components of population change, fertility and 
migration, is affected by age. In all societies, death rates are higher among the 
youngest and oldest groups. Age is interpreted as a point in the cohort life 
cycle. The influence of age can manifest itself through biological, social and 
cultural factors. The factors are largely cohort-specific; sometimes, however, 
their impact is investigated in the course of calendar time. This is because it is 
not always possible to wait until a real birth cohort completes its experience, in 
particular in the case of mortality. In such cases, summary information of 
mortality over a short period of calendar time is used to obtain the required 
information. This observation provides a measure of mortality in a synthetic 
cohort. Note that age-specific mortality of a synthetic cohort is influenced by 
cohort life histories of many birth cohorts. Mortality in the synthetic cohort 
depends on the past experiences of these cohorts. On the other hand, it also 
depends on the factors at play in the period of interest. While the real cohort 
approach focuses on past and present factual experiences of a particular group 
of people, the synthetic cohort approach focuses on answering questions about 
present-day mortality of many such groups, questions which are conditional on 
certain assumptions. Therefore, it offers a model for future experience. 

Causes of death, on the other hand, tell us about the biological mechanisms of 
disease and mortality. Trends in mortality by cause can be linked to trends in 
physiological and behavioural risk factors, which is useful in formulating 
hypotheses for the future. Importantly, changes in mortality by cause result, in 
turn, from changes in the morbidity process in a population. The mortality 
level and the sex and age distributions are all influenced by trends in 
cause-of-death-specific mortality and morbidity, which are therefore important 
forecasting inputs. 

Whereas analysing period and cohort effects belongs to the tradition of 
mortality forecasting, decomposition of mortality by cause of death has 
received less attention from forecasters. Despite the fact that period and cohort 
approaches are most commonly used, comparisons of their outcomes have 
hardly been made. Differences in the outcomes and accuracy of overall 
mortality and cause of death-based forecasts of total mortality, on the other 
hand, have been recently discussed at length by Wilmoth (1995), Caselli 
(1996) and Alho (1991). 

Wilmoth (1995) showed analytically for the proportional rate of change 
models (models linear in parameters) that forecasts of mortality obtained from 
the aggregation of mortality by cause must eventually exceed forecasts based 
on mortality data by all causes taken together (i.e. no decomposition by 
cause). Using these particular models and mortality data from Japan (1951-90), 
he demonstrated that the cause-of-death approach resulted in higher 
mortality (lower life expectancy) than that of total mortality. Caselli (1996), 
too, prepared cause of death-specific projections of mortality for a number of 
European countries using (non-linear) age-period-cohort models with an 
analytical formula expressing the trend in period effects. Data used in her 
study covered seven causes of death and ages 60 to 85+ in the period 
1950-1985. From Caselli's projections, another pattern of results has emerged. In 
general, projections of trends in mortality by cause in all countries reduced the 
advantage enjoyed by females and increased that for men, for whom the gains 
in life expectancy were larger when estimated by cause than those obtained 
from total mortality-based projections. For Dutch men, for example, the 
increase at age 60 was 0.9 years greater in the cause-of-death approach than 
that predicted without taking into account differential trends in mortality by 
cause. For Dutch women, the cause-of-death models gave an estimated 
increase in survival at age 60 that was 1.4 years lower than the average 
increase estimated using trends in total mortality. 
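
Wilmoth's result for proportional rate of change models can be illustrated with a minimal numerical sketch. The jump-off rates and annual rates of decline below are hypothetical, and the aggregate model is assumed to extrapolate the all-cause rate of decline observed in the jump-off year (the mortality-weighted mean of the cause-specific rates of decline). Under these assumptions the sum of the cause-specific forecasts never falls below the aggregate forecast, and the gap widens with the forecast horizon.

```python
import math

# Hypothetical jump-off rates (per 100,000) and annual rates of decline
# for three causes of death; the third cause is increasing.
BASE_RATES = [400.0, 250.0, 150.0]
DECLINE_RATES = [0.03, 0.01, -0.005]


def cause_specific_forecast(base_rates, decline_rates, years_ahead):
    """Sum of cause-specific rates, each extrapolated at its own constant rate."""
    return sum(m0 * math.exp(-r * years_ahead)
               for m0, r in zip(base_rates, decline_rates))


def aggregate_forecast(base_rates, decline_rates, years_ahead):
    """All-cause rate extrapolated at the all-cause rate of decline at jump-off."""
    total = sum(base_rates)
    # Assumption: the all-cause rate of decline in the jump-off year equals the
    # mortality-weighted mean of the cause-specific rates of decline.
    mean_decline = sum(m0 * r for m0, r in zip(base_rates, decline_rates)) / total
    return total * math.exp(-mean_decline * years_ahead)


def forecast_gap(years_ahead):
    """Cause-based forecast minus aggregate forecast of total mortality."""
    return (cause_specific_forecast(BASE_RATES, DECLINE_RATES, years_ahead)
            - aggregate_forecast(BASE_RATES, DECLINE_RATES, years_ahead))


for horizon in (0, 10, 30):
    print(horizon, round(forecast_gap(horizon), 2))
```

The inequality follows from the convexity of the exponential function, which is why it need not hold for the non-linear models used by Caselli.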

Alho (1991) discussed the effects of aggregation on the estimation of trends in 
mortality. In some cases, Alho suggests that the aggregate forecasts appear to 
be more credible. He states that several factors are capable of influencing the 
potential for disaggregation-related gains in forecast accuracy. The effects of 
misclassification of deaths by cause, the cross-correlation between causes, the 
similarity of auto-correlations in different causes, modelling bias, and expert 
judgement are all examples of such factors. The results of different 
(aggregated vs. disaggregated) approaches are not similar if one or more 
causes serve as 'leading indicators' for the remaining causes, or outside 
information is incorporated into forecasting either through expert judgement or 
formal statistical modelling. Also, under highly non-linear models or in the 
presence of modelling errors the results may not be similar. 

Summing up, the major focus of these three recent studies was on differences 
in the mortality outcomes between cause-of-death and total mortality 
approaches. It is clear from the results presented that the general pattern of 
pessimism (i.e. higher mortality) in cause-specific mortality forecasts appears 
only under certain conditions (forecasts obtained as trend extrapolation from 
linear, or linear in parameters, forecasting models). In all other situations, 
compared with the overall mortality approach, both higher and lower 
predictions of total mortality can be obtained from decomposition by cause. 
An explanation of the differences in the outcomes has been proposed by Alho 
in terms of sources of bias in the cause of death-specific models. Alho 
concluded that in some cases more aggregate total mortality forecasting models 
might be more accurate statistically. 

Comparing different forecasting methods by investigating only the differences 
in the forecast outcomes and accuracy would, however, be a poor evaluation 
approach. Rogers¹ (1995; pp. 200-1) clearly points out that “model 
performance is a multi-faced concept that involves much more than forecasting 
accuracy alone. (...) Additional attributes such as transparency, utility and face 
validity, all play an important role in the presentation of population forecasts”. 
This explains why the question about the degree of complexity/simplicity of 
forecasting models cannot be answered unconditionally. In addition, “whether 
simple forecasting models outperform complex models is an empirical issue 
that depends on the particular historical period observed and the degree of 
demographic variability exhibited during this period” (ibid., p. 200). Forecasts 
based on simple trend extrapolation of life expectancy at birth are often better 
in the short term than forecasts obtained from a complex model for mortality 
disaggregated by age and cause of death. However, in long-term forecasts the 
latter may be better than the former. 

Having taken the above statements as a starting point, we compared the 
forecasts up to 2020 obtained from extrapolation of trends in overall mortality 
by period and cohort, or trends in cause of death-specific mortality (by period) 
for four countries. Age and sex distributions of mortality were part of all 
forecasting models. In this chapter we show that considerable differences exist 
in the outcomes of the three approaches. Some patterns were found in the 
outcomes. They cannot, however, be generalised in the way Wilmoth 
suggested for linear models. If a choice had to be made between the three 
approaches, the reasons underlying the choice would have to be in line with 
the suggestions made by Rogers. We shall explain this in more detail in the 
last section of this chapter. 

¹ Despite the fact that the remarks by Rogers and the whole discussion of simplicity 
versus complexity in forecasting models presented in special issues of Mathematical 
Population Studies 5(3), 1995, and International Journal of Forecasting 8(3), 1992, 
were all made in the context of population forecasting, their relevance to forecasting 
any demographic process, including mortality, is indisputable. These two issues offer 
a wealth of useful material about forecasting in general. 



7.3 | Data Types, Sources and Quality 

The design of this research implied that for each of the four countries (Italy, 
France, the Netherlands and Norway) three types of data were required: 
historical data (i.e. dating as far back as at least the mid-19th century) on 
overall mortality in the form of (single-year) age-period-cohort numbers of 
deaths and the respective (single-year) population age structures; period data 
on mortality by cause of death with as detailed as possible an age 
classification; and standard period (single-age) overall mortality data for the 
years from 1950 up to the present. All data were sex-specific. We also 
incorporated the 'tail' of the mortality age pattern, i.e. mortality of those aged 
80 and over up to the last survivor, into the standard data. These data covered 
the years from 1950 onwards, as they are only reliable for the most recent 
period. (Age confirmation is only possible for people born from the mid-19th 
century onwards, when birth registration systems started to become 
operational.) 

The choice of causes of death was based on two considerations: the scientific 
requirements of the analysis and deficiencies in the existing data. From a 
scientific point of view, only leading causes of death with clear age patterns 
and trends and known, predictable risk factors should be included in 
projections. Mortality from less significant diseases usually shows many 
irregularities. Including such causes in the analysis would deteriorate rather 
than improve projections. This served as a guideline when making our 
decision on the number and type of causes of death. The following list of ten 
causes of death (coded according to ICD-9) was selected for our projections: 
stomach cancer (151), trachea/bronchial/lung cancer (162-163), breast cancer 
(females; 174), prostate cancer (185), coronary heart disease (410-414), 
cerebrovascular disease (430-438), pneumonia and influenza (480-486, 487), 
chronic lower respiratory diseases (CLRD; 490-494, 496), external causes 
(E800-E848, E880-E888, E950-E959), and remaining causes. Several 
problems related to the data were encountered. These are described in detail 
elsewhere (Tabeau et al., 1998), along with how they were overcome. 



7.4 | Statistical Method Used 

The statistical method applied in this study is based on the principles of 
parameterisation. Our parameterisation models were, however, estimated on 
the basis of age- and time-specific death rates and included parameters 
specified as functions of time (Tabeau and Tabeau, 1995). This can be 
expressed more formally in the way given in Equations 7.1 and 7.2. Whereas 
a static (i.e. base) parameterisation model is the following: 

$$f_s(x) = f(x, P_1, \ldots, P_k, e(x)) \qquad (7.1)$$

where $x$ is age, $P_1$ to $P_k$ are (base) parameters to be estimated from a set of 
annual age-specific death rates ($f_s(x)$), and $e(x)$ is the error term, the dynamic 
function is given below: 

$$f_d(x, t) = f(x, P_1(t), \ldots, P_k(t), e(x, t)) \qquad (7.2)$$

The $t$ denotes the time variable and $P_1(t)$ to $P_k(t)$ are trend functions which 
have replaced the parameters $P_1$ to $P_k$. The specification of the dynamic model 
is always based on the static base function selected to describe the age profile 
of a process in a single period of time. The dynamic nature of the model is 
obtained by removing time-independent parameters from the static single-year 
function and replacing them by time-dependent ones in the form of trend 
functions. The trend functions can be set as linear and/or nonlinear. Their final 
form depends on how the dynamic model fits the data and what predictive 
properties it has. Forecasting in the dynamic model is only based on two 
regressors: age and time. 
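
The step from the static to the dynamic model can be sketched in a few lines of code. This is only an illustration under stated assumptions: the two-parameter base function, the linear trends and all numerical values are hypothetical and are not the functions estimated in this study, and the error term is left out.

```python
import math


def static_model(age, level, slope):
    """Hypothetical static base function: age profile of death rates in one year."""
    return level * math.exp(slope * age)


def linear_trend(intercept, change_per_year):
    """Return a trend function P(t) = intercept + change_per_year * t."""
    return lambda t: intercept + change_per_year * t


def make_dynamic_model(base_function, trend_functions):
    """Replace each fixed base parameter by a trend function of time."""
    def dynamic_model(age, t):
        params = [trend(t) for trend in trend_functions]
        return base_function(age, *params)
    return dynamic_model


# Hypothetical trends; t counts years since the start of the observation period.
level_trend = linear_trend(5e-5, -5e-7)
slope_trend = linear_trend(0.10, 0.0002)
dynamic_model = make_dynamic_model(static_model, [level_trend, slope_trend])

# The only regressors are age and time.
print(dynamic_model(60, 0), dynamic_model(60, 20))
```

In this form, changing the shape of a trend (linear, hyperbolic, log-normal) means swapping one trend function while the base function stays the same.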

The model can be estimated iteratively by applying a nonlinear least squares 
method. (A useful proposition of an estimation method in terms of weighted 
least squares is also available in Chapter 3 by Heathcote and Higgins). The 
obtained estimator of the vector of parameters of the model tends to be biased 
and usually has an unknown distribution, even if a specific distribution is 
assumed for the error term of the model. However, under suitable conditions, 
it is consistent and asymptotically normally distributed (e.g. Judge et al., 1985 
and Granger and Newbold, 1977). The consistent estimate of the variance of 
the error term can be computed as well, and its asymptotic distribution can be 
derived. The practical consequence of this is that, under certain assumptions, 
all the results for the linear regression are asymptotically valid for the 
nonlinear regression model. Therefore, statistical inference and hypothesis 
testing can proceed in the same fashion as for the linear model. But the 
common measure of fit, i.e. the coefficient of multiple correlation R², is no 
longer guaranteed to lie between zero and one. The sum of squared 
residuals (SSR) and other measures give better indications. 
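
The remark about R² can be made concrete with a small sketch. It assumes R² is computed as one minus the ratio of the sum of squared residuals to the total sum of squares around the mean; the observations and fitted values are invented for illustration only. When the fitted values do not come from a linear model with an intercept, nothing prevents the SSR from exceeding the total sum of squares, so the measure can become negative.

```python
def sum_squared_residuals(observed, fitted):
    """Sum of squared differences between observed and fitted values."""
    return sum((o - f) ** 2 for o, f in zip(observed, fitted))


def r_squared(observed, fitted):
    """1 - SSR / SST; negative when the fit is worse than the sample mean."""
    mean_obs = sum(observed) / len(observed)
    total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sum_squared_residuals(observed, fitted) / total


# Invented data: one close fit and one deliberately poor fit.
observed = [1.0, 2.0, 3.0, 4.0]
good_fit = [1.1, 1.9, 3.2, 3.8]
poor_fit = [4.0, 3.0, 2.0, 1.0]

print(sum_squared_residuals(observed, good_fit), r_squared(observed, good_fit))
print(sum_squared_residuals(observed, poor_fit), r_squared(observed, poor_fit))
```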

The best forecasting models can be selected by evaluating the sum of squared 
residuals, parsimony of model specification, significance of the parameters, 
and the visual fit (on log as well as on normal scale). An analysis of ex-post 
forecast errors can also be carried out. 



7.5 | Fitting and Forecasting: Selected Issues 

Parameterisation with time-dependent parameters is a flexible tool for fitting 
and forecasting of mortality. Problems of fitting and forecasting are 
inseparably connected with each other. They are discussed in this section using 
two examples of practical issues: modelling of mortality by cause of death, and 
the role of alternative hypotheses for old-age mortality. The main results of the 
study are reviewed in Section 7.6. 



7.5.1. Modelling of Mortality by Cause of Death: an Example 

In order to make projections of overall mortality from mortality by cause of 
death, models for mortality by cause are required. The sum of cause-specific 
forecasts is the forecast of all-cause overall mortality. For cause-specific 
mortality in four countries eighty parameterisation models were applied. 
Details of modelling of mortality by cause of death are discussed elsewhere 
(Tabeau et al., 1998). In this section, we give an example of modelling of lung 
cancer mortality in the Netherlands and summarise the study's conclusions. 

For Dutch males, trachea/bronchial/lung cancer (in short: lung cancer; ICD 
codes: 162, 163) was the second leading cause of death in 1993, with the 
age-standardised mortality rate (SMR) for ages 40 and over equal to about 93.8 
deaths per 100,000 people. The trend in SMRs increased from 1950 
(SMR40+ for 1950: 27.8) to the early 1980s (SMR40+ in 1980: 107.9), 
remained stable until 1988 (106.9), and started to decrease from 1989 
onwards. The maximum SMR40+ of 108.7 was reached in 1986. As a result, 
a turning point took place in mortality from lung cancer in the second half of 
the 1980s, and this trend is expected to continue in the coming years. 

The age pattern of (male) mortality from lung cancer is easily fitted by the (left 
tail of the) log-normal function (Figure 7.1). The trend observed in the SMRs 
(Figure 7.2) was mirrored in the annual estimates of the parameter P1, the 
severity of the theoretical curve (Figure 7.3). Note that in the most recent 
years more irregularities were observed in the empirical age curve. At the 
same time, the maximum of the curve moved to higher ages, as shown by the 
estimates of P3, the location of the age curve, which shifted from about 68 
years in 1950 to 80 years in 1990. The width of the age curve increased during 
the observation period. These remarks suggest that (male) mortality from lung 
cancer has persistent cohort patterns. The cohorts born at the end of the 19th 
century and around 1900 clearly have the highest death risks. This confirms 
the hypothesis about the association between smoking and mortality, since the 
above-mentioned cohorts are known as the first heavy smokers. 

As a cause of death, lung cancer is far less significant for Dutch females who, 
for ages 40 and over, had an SMR from lung cancer of about 17.4 per 
100,000 in 1991. Age-standardised mortality has increased more rapidly since 
1970 (SMR for 1970: 6.7; for 1952: 5.3), and most probably it will continue 
to increase in the future. It is interesting to note that age patterns of mortality 
from lung cancer are different for males and females, as shown by the 
empirical age curves and confirmed by fitting the log-normal function to the 
data for females. Although all parameters are significant at a satisfactory level, 
the low fit is unacceptable for the period 1970-1990. A considerable 
improvement in fit was obtained from a three-component function, i.e. the 
sum of a constant and two three-parameter log-normal components. Parameter 
estimates for females are rather irregular, but trends in parameters are 
relatively stable. Irregularities in the parameters are caused by parameter 
correlations. The empirical curves clearly show that, as with males, female 
lung cancer mortality also has a (slight) cohort dimension. Namely, whereas 
the real maximum is always located at ages above 80 years, since 1980 there 
has been another hump at ages lower than 80. This hump was located at the 
age of about 50 years in 1980 and systematically shifted to higher ages in the 
profiles for 1984 and 1988. Summing up, female cohorts born around 1930 
show higher death risks from lung cancer than younger or older cohorts. 

Figure 7.1. Forecast of mortality from lung cancer: age patterns for Dutch men 

Figure 7.2. Empirical and forecasted SMRs. Lung cancer, Dutch men, age 40+ 

The projection model for lung cancer mortality for males was based on the 
log-normal base function: 

$$\log m(x, t) = P_1(t) \exp\left[-P_2(t) \left(\log x - \log P_3(t)\right)^2\right] + e(x, t) \qquad (7.3)$$

Trend functions of the base parameters were specified in the following way: 

$P_1(t)$: log-normal plus a constant: $P_1(t) = P_{11} + P_{12} \exp\left[-P_{13} \left(\log t - \log P_{14}\right)^2\right]$ 

$P_2(t)$: linear decline: $P_2(t) = P_{21} + P_{22} t$ 

$P_3(t)$: linear increase: $P_3(t) = P_{31} + P_{32} t$ 

Figure 7.3. Static and dynamic estimates of base parameters. Mortality from lung 
cancer, Dutch men. Panel: base parameter P1 for lung cancer. 

The model for men included eight parameters which were estimated jointly 
from (several thousands of) age- and time-specific observations of (log) rates 
of mortality from lung cancer. The projection model for females was 
formulated as follows: 

$$\log m(x, t) = P_1 + P_2(t) \exp\left[-P_3 \left(\log x - \log P_4\right)^2\right] + P_5 \exp\left[-P_6 \left(\log x - \log P_7\right)^2\right] + e(x, t) \qquad (7.4)$$

with the hyperbolic increase trend in $P_2$ ($P_2(t) = P_{21} + P_{22}/(P_{23} + t)$), and the 
remaining parameters constant over time. 
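
The two projection models share one building block, the log-normal component, which the following sketch writes out. It covers only the systematic part of the models (no error term), the function names are ours, and the parameter values passed in the usage lines are hypothetical placeholders rather than the estimates obtained in this study.

```python
import math


def lognormal_component(age, height, width, location):
    """height * exp(-width * (log(age) - log(location))**2)."""
    return height * math.exp(-width * (math.log(age) - math.log(location)) ** 2)


def hyperbolic_trend(p21, p22, p23):
    """Return the trend function P2(t) = p21 + p22 / (p23 + t)."""
    return lambda t: p21 + p22 / (p23 + t)


def log_rate_males(age, t, p1_trend, p2_trend, p3_trend):
    """Systematic part of the model for males: one time-varying component."""
    return lognormal_component(age, p1_trend(t), p2_trend(t), p3_trend(t))


def log_rate_females(age, t, p1, p2_trend, p3, p4, p5, p6, p7):
    """Systematic part of the model for females: constant plus two components."""
    return (p1
            + lognormal_component(age, p2_trend(t), p3, p4)
            + lognormal_component(age, p5, p6, p7))


# Hypothetical usage: the component peaks where age equals the location parameter.
peak = lognormal_component(70.0, height=2.0, width=3.0, location=70.0)
off_peak = lognormal_component(50.0, height=2.0, width=3.0, location=70.0)
print(peak, off_peak)
```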

The goodness of fit was high in both models. The coefficient R² for males was 
0.9780 (SSR = 173.39) and for females 0.9727 (SSR = 73.49). All parameters were 
statistically significant. The projection obtained from the model for men is shown 
in Figures 7.1 to 7.3. Projected trends in age-standardised mortality are a 
continuation of those observed since 1980. For males, this gave a systematic 
rapid decrease from the observed level of 93.8 per 100,000 in 1993 to about 
44.9 per 100,000 in 2020. For females, a constant increase was projected, 
from 17.4 in 1991 to 24.6 per 100,000 in 2020. The increase for females will 
be about 41 per cent. 
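
The size of these projected changes can be checked with a short calculation. The sketch below uses only the SMRs quoted above; expressing the change as an average annual rate based on logarithms is our assumption and may differ from the definition used elsewhere in this chapter.

```python
import math


def percent_change(start, end):
    """Total change between two levels, in per cent of the starting level."""
    return 100.0 * (end - start) / start


def average_annual_rate(start, end, years):
    """Average annual (continuous) rate of change, in per cent per year."""
    return 100.0 * math.log(end / start) / years


# SMRs per 100,000 at ages 40 and over, as quoted in the text.
females_change = percent_change(17.4, 24.6)                  # 1991 to 2020
males_change = percent_change(93.8, 44.9)                    # 1993 to 2020
females_rate = average_annual_rate(17.4, 24.6, 2020 - 1991)
males_rate = average_annual_rate(93.8, 44.9, 2020 - 1993)

print(round(females_change, 1), round(males_change, 1))
print(round(females_rate, 2), round(males_rate, 2))
```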



7.5.2. Alternative Assumptions in Forecasting Old-Age Mortality 



Old-age mortality is the key component responsible for changes in mortality in 
the future. Modelling mortality of the oldest-old cannot take place within the 
framework of the cause-specific approach, in view of the multiple causes of 
death of this population group. Overall mortality was modelled for periods as 
well as for cohorts using a modified Gompertz function as in the 
Heligman-Pollard (H-P) model (Heligman and Pollard, 1980). Mortality rates by 
single-year age groups were the dependent variable: 

$$m(x, t) = \frac{P_7(t) \, P_8(t)^x}{1 + P_7(t) \, P_8(t)^x} + e(x, t) \qquad (7.5)$$

where m(x,t) is the mortality rate at age x in the year/cohort t, and P7(t), P8(t) 
are time-dependent parameters. Note that the values and interpretations of 
parameters P7 and P8 are the same as in the original Gompertz. P7 is the 
initial level of senescent mortality and P8 the increase in old-age mortality per 
one year of age. 

In many cases, several models can be fitted to data of one type almost equally 
well (e.g. overall mortality by period). These models are usually similar, also 
in terms of forecasts. However, examples exist of models with comparable 
goodness of fit and yet very different forecasts. The major difference between 
the alternative models is related to the way in which the future is modelled, 
i.e. hypotheses used for extrapolating past trends into the future. Alternative 
specifications of hypotheses can be formulated in any of the three modelling 
approaches. Differences resulting from the various hypotheses applied are 
most pronounced when overall mortality is modelled. An example of this is 
given below for overall period mortality of Dutch men. 

For Dutch men, two alternative extrapolations were almost equally good in 
statistical terms (Table 7.1, models M1 and M8). Life expectancy at birth is 
75.4 years in the first model and 82.2 years in the second. The 
assumption used for old-age mortality in the first model was that the general 
level of old-age mortality of Dutch men will gradually decline but at a lower 
pace than the decline observed in the past seven to eight years. The increase in 
mortality with age will continue to rise, as observed recently, albeit less 
rapidly than in the past seven to eight years. The assumptions used for the 
second model were different with respect to the decline in the general 
mortality level. In this case, the sharp decline observed in the past seven to 
eight years was assumed to continue in the future. Consequently, life expectancy 
at 40 differs between the two models, at 36.2 and 41.1 years, respectively. It is 
thus clear that the method of modelling old-age mortality has a strong influence 
on the final outcome of the projection. 

The two different ways of modelling old-age mortality are the source of huge 
differences in forecasted survival in the future. The first model was selected 
rather than the second in view of its statistical advantage. However, because 
the statistical difference between the two models is not very large, a question 
still remains open: is the selected model the most reliable one? 

In general, historical trends in the Gompertz parameters are useful when 
identifying the mechanisms of past reductions in old-age mortality. Possible 
variants of changes in the Gompertz parameters over time and cohorts are 
summarised in Table 7.2. A number of specifications can be incorporated into 
trends in the Gompertz parameters. Three specifications, nos. 1, 2 and 4, were 
not considered realistic for developed countries as they imply a decline or a 
stabilisation in life expectancy in the future. Another one, no. 6, although 
plausible, involves difficulties in the anticipation of predicted changes in life 
expectancy. Specifications nos. 3 and 5 appeared to be the most relevant. In the 
best projections of the period approach, a stable mortality level and a declining 
age-related increase were generally found for women. An opposite 
combination, declining level and stable age-related increase, was selected as 
the best projection in the cohort approach for both sexes. For French and 
Norwegian men in the period approach, further drops in the general mortality 
level will be needed to bring about an increase in life expectancy, and the 
age-related increase in mortality will remain stable in the coming years. For Dutch 
and Italian men, assumption no. 6 had to be used. 

Table 7.1. Alternative models for mortality of Dutch men 

Model   Specification of     e(0)    No. of parameters          Sum of squared   R²
        P7        P8                 Total   Non-significant    residuals
M0      hyp/t     hyp/t      75.1    14      3                   85.55           0.995
M1      hyp/t     hyp/t      75.4    12      -                   81.89           0.995
M2      hyp       hyp        74.9    18      9                   81.65           0.995
M3      lin       lin        86      14      1                   82.99           0.995
M4      hyp/t     lin        72.7    13      -                   82.23           0.995
M5      log-nor   log-nor    74.3    16      2                   81.19           0.995
M6      log-nor   lin        75      15      2                   81.92           0.995
M7      lin       log-nor    89.8    15      2                   81.72           0.995
M8      lin       hyp/t      82.2    13      -                   83.40           0.995
M9      lin/fix   hyp/t      76.3    12      4                  440.49           0.974

Notes: “hyp” - hyperbolic, “lin” - linear, “log-nor” - log-normal, “t” - the presence of target 
parameters, and “fix” - fixed parameters. The target and fixed parameters are defined 
a priori and not estimated. 

Table 7.2. Overview of trends in the Gompertz parameters used for our projections 

No.   P7: Mortality   P8: Increase   e(40): Anticipated   Assumptions applied   Assumptions applied
      level           with age       change               by period             by cohort
1     Stable          Stable         Stable               -                     -
2     Stable          Increase       Decline              -                     -
3     Stable          Decline        Increase             Fr_f, Nl_f, No_f      -
4     Increase        Stable         Decline              -                     -
5     Decline         Stable         Increase             Fr_m, No_m, It_f      Fr_f/m, It_f/m, Nl_f/m, No_f/m
6     Decline         Increase       Unknown              Nl_m, It_m            -
7     Decline         Decline        Increase             -                     -

The alternative formulations of trajectories for the Gompertz parameters offer 
a flexible tool for modelling the future. However, even a small modification of 
the trend in one of the two parameters can result in dramatic changes in the 
resulting life expectancy. This was shown by the example of Dutch men and 
can be explained by the strongly nonlinear nature of the Gompertz function. 
The most difficult specification to model is specification 6. Here, changes in 
life expectancy are not easy to predict and experiments must be used to find 
the optimal trajectories and to formulate the best projection models. 
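
The sensitivity of life expectancy to the two parameters can be explored with a minimal sketch of the function in Equation 7.5. The parameter values below are hypothetical and are not the estimates for any of the four countries. The life-table step assumes a constant force of mortality within each year of age and a closing age of 120, both of which are simplifications of ours.

```python
import math


def old_age_rate(age, p7, p8):
    """Systematic part of Equation 7.5: p7 * p8**age / (1 + p7 * p8**age)."""
    odds = p7 * p8 ** age
    return odds / (1.0 + odds)


def life_expectancy(start_age, p7, p8, max_age=120):
    """Remaining life expectancy at start_age from single-year death rates.

    Assumes a constant force of mortality within each year of age, so the
    one-year survival probability is exp(-rate). Person-years lived in each
    age interval are approximated by the trapezoidal rule.
    """
    survivors = 1.0
    years_lived = 0.0
    for age in range(start_age, max_age):
        next_survivors = survivors * math.exp(-old_age_rate(age, p7, p8))
        years_lived += 0.5 * (survivors + next_survivors)
        survivors = next_survivors
    return years_lived


# Hypothetical parameter values, chosen only to illustrate the sensitivity.
baseline = life_expectancy(40, p7=4.0e-5, p8=1.100)
lower_level = life_expectancy(40, p7=3.0e-5, p8=1.100)      # decline in P7
steeper_slope = life_expectancy(40, p7=4.0e-5, p8=1.105)    # increase in P8

print(round(baseline, 1), round(lower_level, 1), round(steeper_slope, 1))
```

A decline in P7 with P8 fixed raises life expectancy and an increase in P8 with P7 fixed lowers it. When both move at once, as in specification 6, the net effect depends on their relative size, which is why experiments of this kind are needed.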

Summing up, if no quantitative indications are available for (old-age) mortality 
change in the future, the choice of hypotheses remains subjective. Ideally, this 
choice should be based on quantitatively expressed expectations obtained from 
explanatory models for mortality. In this case, we know with a reasonable 
degree of certainty under what scenario future trends would be reached. 
Examples of this approach can be found in the literature devoted to estimates 
of maximum life expectancy (e.g. Manton, Stallard and Tolley, 1991; see also 
Chapter 2 in this volume). Another option is offered by qualitative 
assumptions which ultimately are expressed as a quantitative indicator of 
future trends. Qualitative assumptions, however, introduce a larger degree 
of subjectivity into the analysis. 



7.6 | Patterns in Forecast Outcomes: Summary of the Results 

In the previous section, we showed how the forecasts of mortality in four 
countries were produced and how important forecast hypotheses are for old- 
age mortality. In this section, the differences between the outcomes of the 
three forecasting approaches, the overall mortality by period, cause-specific 
period, and overall mortality by cohort approaches, are discussed. 

We begin with a summary of the differences between the three types of 
approaches in terms of trends in age-standardised mortality rates (Figure 7.4). 
In general, the lowest mortality has been predicted for men by the cause-of- 
death approach and for women by the overall mortality by period approach. 
The highest variant for both men and women was found in the cohort overall 
mortality model, the prediction for women being particularly high, in some 
cases higher than recently observed. There is a clear absence of international 
harmonisation of the forecasts. No attempts to harmonise the forecasts were 
made, and therefore three distinct outcomes were obtained. Convergence or 
divergence of predicted trends is purely an extrapolation effect. Choosing 
among the three types of forecasts appears to be a difficult task and further 
explanation is needed. This is given in Sections 7.6.1 and 7.6.2, using the 
concepts of life expectancy and the annual percentage rate of change, measures 
which are more informative than SMRs. 



7.6.1. Overall Cohort versus Overall Period Mortality: a Cross-National 
Comparison 

When the overall period and cohort approaches are used to forecast old-age 
mortality, two very different levels of future mortality are obtained² (Table 
7.3). 

Around 1993, (observed) life expectancy at ages 40, 60 and 80 was similar for 
women in all countries except France, where it was considerably higher. In the 
forecasts of overall mortality by period, this general pattern persists until 
2020. 



² One possible approach to comparing mortality by period and by cohort is either to 
transform period patterns into cohort patterns or, conversely, cohort patterns into 
period ones. We opted for the second alternative. In this case, the comparison covers 
the entire prediction horizon and is shown over calendar time. 




Table 7.3. Life expectancy in France, Italy, the Netherlands and Norway in 2020 according to the period, cause-of-death and cohort 

projection approaches 

Figure 7.4a. Empirical and forecasted age-standardised mortality rates. 
Overall period, cause-specific period forecasts. Panel: cause-specific approach, 
men, 40+ (horizontal axis: year). 

Figure 7.4b. Empirical and forecasted age-standardised mortality rates. 
Overall period, overall cohort, cause-specific forecasts. Panels: cause-specific 
approach, women, 40+; overall period approach, women, 40+; overall cohort 
approach, women, 40+. Countries shown: Norway, Italy, Netherlands, France. 

Table 7.4. Observed and projected average annual rates of change in SMRs by forecasting approach 

However, life expectancy of French women is predicted to be even more 
extreme, implying a larger difference compared with the other countries, 
whereas that of Italian women appears to be slightly underestimated.³ 

For men, around 1993 rather small differences were noted between the four 
countries in life expectancy at all selected ages, in particular at 40. At age 60 
and 80, men in France and Italy topped the list, and the Netherlands and 
Norway lagged behind slightly. In the 2020 forecasts based on overall period 
mortality this general pattern persists, one exception being men in the 
Netherlands whose life expectancy by 2020 was lower than expected at any 
age. The finding for Dutch men is an effect of the recent trends in the 
Gompertz parameters P7 and P8. Note that the overall mortality period 
approach for women results in forecasts which show a slightly higher annual 
rate of decline than the percentage annual decline observed in the recent past 
(1975-1993; Table 7.4). For men, the forecasts of overall mortality by period 
show the opposite, i.e. a slightly lower annual rate of decline than the 
percentage annual decline observed. 
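
The comparison of observed and projected annual rates of decline can be illustrated with a minimal sketch. It assumes that the average annual rate of change is computed from the first and last age-standardised mortality rate of a period under a constant exponential trend; the chapter does not state the exact formula behind Table 7.4, and the function names and input values below are hypothetical.

```python
import math


def average_annual_rate_of_change(smr_start, smr_end, years):
    """Average annual change in per cent, assuming a constant exponential trend."""
    return (math.exp(math.log(smr_end / smr_start) / years) - 1.0) * 100.0


def extrapolate(smr_last, annual_rate_percent, years_ahead):
    """Carry the last observed rate forward at a constant annual rate of change."""
    return smr_last * (1.0 + annual_rate_percent / 100.0) ** years_ahead


# Hypothetical standardised rates per 100,000 (not taken from Table 7.4).
observed_rate = average_annual_rate_of_change(1000.0, 800.0, 18)  # 1975-1993
projected_2020 = extrapolate(800.0, observed_rate, 27)            # 1993-2020
print(round(observed_rate, 2), round(projected_2020, 1))
```

A forecast whose implied annual rate is more negative than the observed one, as reported here for women, would end below such a constant-rate extrapolation; the opposite holds for men.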

With respect to the forecasts obtained from the cohort approach, we see that 
cohort life expectancies are very low in 2020 for each sex and each country. 
For women at lower ages (i.e. 40 and 60), life expectancies are not only low, 
but they are also at currently observed levels, or even below these levels. 
Norwegian women are the only exception, with a reasonable level in the 
forecast of cohort mortality, in particular at ages 40 and 60. 

For men, as in the observations around 1993, the cohort forecasts do not show 
large differences between the countries by 2020. The forecasts are relatively 
low, but above current levels of life expectancy at ages 40, 60 and 80. This is 
an important difference compared with women. The forecasts for men in the 
Netherlands are clearly higher than the forecasts produced by the period 
approach. This suggests a conservative forecast of overall mortality by period 
for Dutch men. 



^ The forecast for Italy seems to be too low and may be somewhat unreliable because 
Italy was the only country for which we were not able to collect single-year 
mortality data and had to use five-year age intervals instead. These poor data 
discouraged us from using trends in Gompertz parameters for Italy to develop a new 
complex forecasting model for this country. The forecasting methodology applied for 
Italy was similar to that for Norway and the Netherlands because these models fitted 
the Italian data better than the models for France. 
When period and cohort forecasts are compared, higher life expectancies for 
both men and women are generally obtained in 2020 in the overall mortality 
approach by period, the only exception being Dutch men whose period 
forecast is high (i.e. high mortality and low life expectancy). 



7.6.2. Comparison of Overall Period and Cause-specific Period Mortality 

Our investigations clearly show that overall period and cause-specific period 
approaches produce different future levels of mortality, and consequently of 
life expectancy (Table 7.3). For women in all four countries, the overall 
approach results in higher life expectancies than the cause-specific approach. 
This regularity is confirmed at each investigated age, at birth, 40, 60 and at 
80. At age 40, the biggest difference between the two approaches, 2.5 years, 
was found for Norwegian women and the smallest, 0.2 years, for Italian 
women. At ages 60 and 80, the differences followed similar patterns. 

For men, the findings are also clear, except for Norway. For men in France, 
Italy and the Netherlands, the overall approach generally resulted in lower life 
expectancies by 2020 than the cause-specific projections. At 40, levels were 
higher in the cause-specific approach by 2.5, 1.1, and 0.6 years for the 
Netherlands, France and Italy, respectively. For Norwegian men, the overall 
approach generally produced higher life expectancies (by 1.1 years at 40, and 
0.9 at 60, equal values at 80) than the cause-specific projections. This may be 
explained by the fact that the cause-specific projection (i.e. life expectancy) is 
relatively low for Norwegian men, a result implied by past trends in mortality 
by cause of death, mainly by high-level, stable trends in mortality from 
coronary heart disease and external causes and still increasing trends in 
mortality from lung cancer. 

It is worth noting that both the total gains in life expectancy over the entire 
projection horizon and the (relative) age distribution of the gains were different 
in the two approaches. For both men and women in the cause-of-death 
approach, a higher percentage of the total gain in life expectancy at birth in 
each country was attributed to changes in mortality after age 60. 

For both genders (jointly and also separately) in 2020, the overall and 
cause-specific methods produce similar (or slightly lower in the overall approach) 
percentages of survivors at age 60, but not at age 80. For each country except 
Norway, the cause-of-death projections result in larger populations aged 80 
and over, for both genders taken together, by six per cent in the Netherlands, 
by 5.1 per cent in France, and by about 1.9 per cent in Italy. Norway is again 
an exception, with a smaller percentage of survivors (down by 3.5 per cent) 
estimated in the cause-specific approach. The differences are certainly not 
small, in particular for countries with large populations, like France. 



7.7. Summary and Discussion 

The major goal of this study was to produce three types of forecasts of 
mortality for four developed countries, each set of forecasts being conditional 
on developments in one of the three factors: period, cohort and cause-of-death. 
The three factors are known to influence mortality through the mechanisms 
described in Section 7.2. This research focused on testing the role of the three 
factors when extrapolating past trends rather than basing forecasts on official 
national assumptions. Comparisons with the national forecasts can show which 
method resulted in most convergence. 

The three approaches produce different levels of mortality, and consequently 
of life expectancy, the percentage of survivors, and sex differences in future 
mortality. When the two period approaches are compared, for women in all 
four countries the overall approach results in higher life expectancies than the 
cause-specific approach. For men in France, Italy and the Netherlands, the 
overall approach generally results in lower life expectancies by 2020 than the 
cause-specific projections, again noted for ages 40, 60 and 80. Norway is a 
country where in recent years (i.e. after 1970) cause-of-death patterns have 
differed from those in the other countries. For men in particular, since 1970 
mortality from coronary heart disease and from external causes was stable at a 
high level and mortality from lung cancer was still on the rise. This implied 
that cause-specific projections for Norwegian men resulted in rather low life 
expectancies. In effect, for Norwegian men, the overall mortality approach 
produced higher life expectancies than the cause-specific projections. It is 
worth noting that for both men and women in all countries the (relative) age 
distribution of the gains in life expectancy, the percentage of survivors and sex 
differences in mortality were different in the two forecasting approaches. In 
particular, for both genders jointly the percentage of survivors at age 80 was 
considerably higher in the cause-of-death approach. What this implies is that 
the two period approaches are complementary rather than competitive. 
Cohorts that are most relevant to our projections were born in the 20th century. 
Some of their members are still alive and their mortality patterns incomplete. 
The projections of cohort mortality are therefore particularly uncertain. In this 
study, the levels and patterns of mortality of cohorts born in the 19th century 
were used to forecast cohort mortality. The choice of cohorts, somewhat 
unfortunate perhaps, was prompted by the completeness of their mortality 
histories and by difficulties encountered in formulating hypotheses for cohorts 
born in the 20th century. As a result, excessively high levels of (old-age) mortality 
were likely to result from the cohort approach to forecasting. And indeed, the 
forecasts of cohort life expectancies are low for each sex and each country. 
For women at less advanced ages (i.e. 40 and 60), life expectancy is at 
currently observed levels, or just below these levels. This implies that 
projections of mortality obtained from the cohort approach cannot be taken as 
a reasonable alternative to overall period or cause-specific period approaches. 

If one had to make a choice among the three forecasting approaches, one could 
do this in several ways, the two major methods being quantitative and 
qualitative. A quantitative method of solving the dilemma of choosing among 
the different forecasts would be to look at the statistical accuracy of prediction, 
which in the cause-of-death approach might be worse than in the overall mortality 
period approach. Reasons for applying this method are summarised by Alho 
(1991) and reviewed in Section 7.2 of this chapter. Rogers (1995), on the 
other hand, suggests that forecast accuracy is an empirical issue, and depends 
on the particular historical period observed and the degree of demographic 
variability exhibited in mortality trends in the countries investigated. Indeed, 
the accuracy of our models in the three approaches and the countries studied is 
not entirely comparable. The quantitative answer does not seem to be the best 
one when choosing among the three forecasts. 

Another way of solving the dilemma is qualitative and is based on paying 
attention to aspects of prediction such as transparency of assumptions, options 
for empirical validation, and utility of the forecasts. Formulating assumptions 
for overall mortality is not an easy task. In the fourth stage of the 
epidemiological transition currently taking place in western Europe, the levels 
of life expectancy are high and further changes in life expectancy can hardly 
be predicted from past experience. Some authors (e.g. Olshansky and Carnes, 
1994 and Carnes and Olshansky, 1993) suggest that the law of diminishing 
returns will hamper a further increase in life expectancy. This is not 
necessarily the only possible scenario for the future, as the view of 'enormous 
plasticity' of the life span of humans is popular in science as well (e.g. Finch, 
1997). The central premise of the plasticity of ageing is that there are no 
predefined limits to the life spans of living organisms. The life spans can flexibly 
change, depending on the prevailing environmental influences that constitute 
the context for the evolutionary forces of life. Due to the fact that the number 
of determinants of mortality is extremely large, empirical validation of 
assumptions for total mortality, both period and cohort, is practically 
infeasible. A simple way of overcoming the difficulties of choosing the 
assumptions for overall mortality is to predict total mortality from mortality 
decomposed by causes of death. Trends in mortality by cause of death can be 
linked with trends in risk factors of diseases - causes of death. Validating 
assumptions for cause-specific mortality can be done using epidemiological 
models of health of the populations concerned. This seems to be far simpler, 
yet it is not simple at all, as only joint efforts of demographers and 
epidemiologists can result in realistic forecasts of cause-specific (and, 
consequently, overall) mortality. Finally, one should also note that forecasts of 
mortality by cause of death are urgently needed for many purposes, one 
example being the estimation of health care costs and, in particular, of the 
disability costs related to the use of health care services in ageing populations. 
Estimates of disability costs should be based on a variety of factors, including 
future levels of cause-specific mortality. Both demographic and 
epidemiological forecasts of mortality by cause are potentially useful in 
estimations of this kind. Having said that, epidemiological and demographic 
forecasts of cause-specific mortality tend to be based on marked conceptual 
differences and therefore have different outcomes. Discrepancies observed 
between these two types of forecasts of cause-specific mortality need to be 
better understood. 

In view of the above, we are convinced that the qualitative approach to 
choosing among different forecasting approaches is more useful to forecasters 
than the quantitative method. In this study, a qualitative method would lead us 
to choose the cause-of-death approach. This should, however, be 
supplemented by overall mortality (period and/or cohort) predictions. A simple 
reason for using multiple forecasting approaches found in our study is that the 
variant closest to the national forecasts was the cause-of-death approach for 
men and the overall mortality period approach for women. 

In order to provide a more general explanation of the need to use 
multiple-approach forecasts, it would be useful to examine the three forecasting 
approaches in terms of their advantages and drawbacks. Despite the fact that 
projections of cohort mortality do not seem to be comparable to the two sets of 
period projections, we would not conclude that their usefulness is limited. 
Only mortality observed by cohorts can serve to estimate the life-long effects 
of health-related behaviour and lifestyles, which in developed countries, 
together with access to health care, have become the key determinants of 
mortality. In our study, the cohort approach overestimates mortality in the 
future, especially for women. This conclusion cannot be generalised for the 
cohort approach. It was a result of the study design opted for, which in turn 
was prompted by difficulties encountered in formulating hypotheses for 
cohorts born after 1930. It is clear that cohorts born in the 20th century are 
most relevant to projections of mortality in the near future. However, in view 
of the incompleteness of these cohorts, modelling cohort mortality is 
particularly difficult and projections of cohort mortality are exceptionally 
uncertain. Whereas for the overall mortality period approach hypotheses need 
to be drawn up about changes in mortality over time, for the overall cohort 
approach two types of assumptions are required: assumptions about changes in 
cohort mortality over time and assumptions about changes in cohort mortality 
over age. In conclusion, we can state that modelling and projecting cohort 
mortality needs new efforts in terms of both the methods and applications 
used. 

With respect to the overall mortality period approach several remarks can be 
made to summarise our findings and support future forecasting strategies. 
Most importantly, more alternative projections of a comparable statistical 
quality can be produced using the overall period approach. The alternative 
projections may be higher and lower than the trend-based projections. They 
depend on the assumptions used for old-age mortality (here, on future trends 
for the Gompertz parameters). In this study, recent trends for the Gompertz 
parameters have been found to be inconclusive, indicating a variety of national 
patterns. In sum, the role of assumptions regarding the future is particularly 
important in the overall mortality period approach. 
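
How strongly such assumptions drive the outcome can be illustrated with a minimal sketch. It assumes a plain two-parameter Gompertz hazard above age 60, mu(t) = a * exp(b * t) with t counted in years since age 60; this is a simplification chosen for illustration and not the parameterisation (P7, P8) of the model used in this chapter. All parameter values are hypothetical.

```python
import math


def gompertz_survival(t, a, b):
    """Probability of surviving t years past age 60 under hazard a * exp(b * t)."""
    return math.exp(-(a / b) * (math.exp(b * t) - 1.0))


def remaining_life_expectancy(a, b, horizon=60.0, step=0.01):
    """Life expectancy at 60: trapezoidal integral of the survival curve."""
    n = int(round(horizon / step))
    total = 0.0
    for i in range(n):
        s0 = gompertz_survival(i * step, a, b)
        s1 = gompertz_survival((i + 1) * step, a, b)
        total += 0.5 * (s0 + s1) * step
    return total


# Hypothetical variants: the level parameter a differs, the slope b is held fixed.
scenarios = {"low mortality": 0.008, "trend-based": 0.010, "high mortality": 0.012}
for name, a in scenarios.items():
    print(name, round(remaining_life_expectancy(a, 0.10), 2))
```

Variants of this kind share the same functional form and fit, which is why the choice among them rests on the assumptions rather than on statistical quality.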

It must be stressed that old-age mortality, i.e. mortality after age 60 and in 
particular after 80, is the key component of any projection model in the overall 
period and cohort forecasting approaches. The hypotheses chosen for this 
component are crucial to the outcome of the forecasts, but their formulation is 
difficult. The hypotheses can no longer be formulated on the basis of past 
experience. They should be based on a source other than mortality. Longevity 
research is a good alternative. Explanatory models of mortality are extremely 
relevant. Finally, qualitative analyses of mortality prospects offer a useful 
approach as well. 

The cause-of-death approach is more encouraging than the other two 
approaches. This may be explained primarily by the simplicity of the 
parameterisation functions for mortality by cause and by the fact that 
epidemiological knowledge can be used in formulating hypotheses for the 
future. This approach also has a number of difficulties. Most importantly, the 
cause-of-death approach is a process component approach. As such, it might 
produce a bias in prediction which is related to the aggregation of the 
components (the causes of death). The bias is mainly a result of 
mis-classifications of deaths, which imply cross-correlations between mortality 
trends from different but related causes. Mis-classifications of deaths are 
particularly common in the largest and most heterogeneous category 'the 
remaining causes'. Predicting mortality from the remaining causes as a 
separate, independent category does not seem to be justified. The dynamic of 
the remainder is based to some extent on mis-classified deaths. Another 
difficulty of the cause-of-death approach is that the underlying cause of death 
cannot be easily identified for ages after approximately 90 years, which 
implies that cause-specific projections can only extend up to an (aggregated) 
relatively low age, such as 90 years and over. Lastly, different causes are 
meaningful for different segments of the population, e.g. for children, adults 
and the elderly. Therefore, cause-specific projections can be made for only 
one sub-population at a time. 

The above remarks should not discourage researchers from applying this 
approach. As long as the weaknesses of this forecasting method are 
recognised, the cause-of-death approach can be recommended as an 
exceptionally informative and relatively 'safe' method of forecasting mortality. 
The following detailed recommendations can be given with respect to this 
approach. 

• Conceptually, the cause-of-death approach should be seen as a method of 
improving forecasting of mortality in terms of structural (age and cause) 
patterns and not necessarily in terms of the statistical quality of prediction. 
The importance of forecasting mortality by cause of death is related to the 
fact that disability related to chronic diseases (causes of death) is becoming 
a serious problem in the ageing populations of today. Estimates of the costs 
of medical treatment of chronic diseases are conditional on future levels of 
disease prevalence. 

• In (demographic) practice, the cause-of-death approach should be ideally 
used in conjunction with the overall period (or overall cohort) approach. 
The overall approach is particularly suitable for long-term forecasting of the 
general level of mortality. Summary mortality indicators such as life 
expectancy at birth or age-standardised mortality rates are examples of the 
preferred indices in long-term prediction. These indices guarantee a 
minimum aggregation-related bias of prediction. The cause-of-death 
approach might be risky when used for long-term forecasting (e.g. up to 
2050). For short-term forecasting, the cause-of-death approach seems to be 
'safe'. 

• The selection of causes of death should be different for different population 
segments. For each population segment, the selection of causes is crucial to 
the outcome of the prediction. It is important to distinguish main groups of 
causes, and within each group the leading causes and the remaining causes. 
The category 'remaining causes' should not be predicted as a separate cause 
but, for instance, as a difference between overall and cause-specific 
forecasts. 

• The cause-of-death approach does not seem to be the best approach for 
forecasting mortality of the oldest-old. Another modelling method, for 
instance parameterisation of overall mortality, should be considered for this 
purpose. Recently, three models were recommended by the UN Working 
Group on “Projecting old-age mortality and its consequences” (UN, 1996): 
a modified Gompertz function as in the Heligman-Pollard model, the 
Coale-Kisker model (ibid.), and the Himes-Preston-Condran relational model for 
mortality (ibid.). The third of the three alternatives was used in our study. 
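
The recommendation above to predict the category 'remaining causes' as a difference, rather than as a separate cause, can be sketched as follows. The function name, the cause labels and the rates are hypothetical; the only logic taken from the text is the subtraction of the cause-specific forecasts from the overall forecast.

```python
def remaining_causes_rate(overall_rate, leading_cause_rates):
    """Residual category: overall forecast minus the sum of cause-specific forecasts."""
    residual = overall_rate - sum(leading_cause_rates.values())
    if residual < 0.0:
        # The independently made forecasts are inconsistent and must be reconciled.
        raise ValueError("cause-specific forecasts exceed the overall forecast")
    return residual


# Hypothetical forecast rates per 100,000 for one sex and one age group.
leading = {"coronary heart disease": 300.0, "lung cancer": 100.0, "external causes": 50.0}
print(remaining_causes_rate(1000.0, leading))
```

Treating the remainder in this way keeps the sum over causes equal to the overall forecast by construction, and a negative residual signals that the two sets of forecasts are inconsistent.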

Forecasting models serve to quantitatively express the opinions of the 
forecaster. When applied mechanically, any model will fail to produce 
convincing predictions. Besides data and statistical methods, all models require 
an input in the form of expectations formulated by the forecaster. Expectations 
tend to be qualitative. They are transformed during the modelling process into 
quantitative forecasts. Several stages must be followed in order to obtain the 
required outcome: model specification, estimation, testing, and finally 
prediction. Complex models are better suited than others to performing these 
tasks, but most importantly, all models, simple and complex, need excellent 
hypotheses for the future. These hypotheses should be central to forecasting 
mortality in the near future. 
Abstract 

Mortality projections tend to be based on past trends in mortality statistics. 
These demographic projections do not deal explicitly with trends in major risk 
factor prevalence and disease survival. To assess the relevance of such trends, 
we applied a multistate life table to make lung cancer and coronary heart 
disease mortality projections for 2015. We implemented different scenarios for 
smoking prevalence and for disease survival in the Netherlands. We 
subsequently compared our results with demographic disease-specific mortality 
projections. 

The exercise demonstrates substantial discrepancies between the results of 
trend extrapolation of cause-specific mortality and those of the epidemiological 
projections using a reference scenario for risk factor prevalence. Differences 
in lung cancer mortality can be largely explained by trends in smoking 
prevalence over the last four decades. Differences in coronary heart disease, 
however, can be explained only for a small part by past trends in smoking 
prevalence. The improved survival from myocardial infarction is of much 
greater relevance. 

Mortality projections can be improved by incorporating underlying 
epidemiological processes, such as risk factor prevalence and disease survival. 
Demographic and epidemiological projections should be regarded as being 
complementary rather than mutually exclusive. 



8.1. Introduction 

In conceptual models of public health, an individual's health is regarded as 
being a result of interacting endogenous and exogenous determinants. Many 
epidemiological studies have shown that exogenous risk factors such as 
unfavourable lifestyles, lack of education, unemployment or occupational 
hazards may have a major effect on the risk of developing chronic diseases, 
depending on the individual's inherited or acquired susceptibility, of course. In 
recent decades, the distribution of risk factor prevalence has changed in a 
rather heterogeneous manner, partly as a result of socio-cultural developments, 
increased income and improved education, and partly also due to preventive 
policies (Gunning-Schepers and Jansen, 1997). The health care system is 
another important determinant of health, affecting incidence as well as survival 
of chronic diseases (Ruwaard and Kramers (eds.), 1998). 

Demographic projections of future mortality are, however, often made by 
extrapolation of past trends. Although sophisticated statistical methodology is 
employed, these forecasting models do not explicitly reflect existing 
knowledge of the forces that shape patterns of mortality, such as the 
distribution of major risk factors over a period of time (Willekens, 1990; 
Manton et al., 1993 and Tabeau et al., 2000). It is therefore unclear whether 
the huge social, cultural and economic developments that have taken place in 
the Netherlands and other developed countries since World War II, and their 
influence on prevention and health care, are captured adequately in 
demographic trend extrapolations. To shed some light on this issue, this 
chapter compares results of trend extrapolations of cause-specific mortality 
with those of epidemiological mortality projections, employing an 
epidemiological chronic diseases simulation model (Hoogenveen et al., 1998). 

In our epidemiological projections, we investigate the association between 
smoking on the one hand and lung cancer and coronary heart disease on the 
other hand. We discuss the implications of a number of alternative scenarios 
formulated in terms of smoking prevalence and coronary heart disease survival 
in the context of mortality from these two diseases. Under the reference 
scenario, smoking prevalence and coronary heart disease survival follow past 
trends. Alternative scenarios adjust the trend in order to match the results 
obtained by trend extrapolation of mortality from lung cancer and coronary 
heart disease. 

In the Netherlands, as in most western countries, lung cancer is a major cause 
of death among males. Lung cancer among Dutch men accounted for a death 
rate of 90 per 100,000 in 1994. It is estimated that approximately 85 per cent 
of these deaths were caused by smoking. The burden of coronary heart disease 
(CHD) is also substantial, as indicated by the 1994 male and female 
prevalence rates of more than 11 and 8 per 1,000, and the 1994 death rates of 
155 and 115 per 100,000, respectively, for males and females in the 
Netherlands (Ruwaard and Kramers (eds.), 1998). Among several risk factors 
of CHD, smoking accounts for between 20 and 40 per cent of the cases 
(Gunning-Schepers and Jansen, 1997). In recent decades coronary heart 
disease mortality has declined substantially. Much of this decrease may be 
attributed to better medical treatment of acute myocardial infarctions (e.g. 
intensive care, thrombolytic pharmaceuticals, aspirin). Furthermore, there 
appears to be a shift towards congestive heart failure, which is, at least partly, 
a result of improved survival (Bonneux et al., 1997; Bots and Grobbee, 1996 
and Hunink et al., 1997). 

Section 8.2 of this chapter describes the chronic diseases model used in the 
analysis. The types of scenarios are addressed in Section 8.3.1. Section 8.3.2 
presents the results of the analysis for lung cancer and for coronary heart 
disease. In Section 8.4 the results are summarised and discussed. 



8.2. The Chronic Diseases Model 

Several simulation models of chronic diseases have been developed and used 
for public health policy evaluations; see, for example, Weinstein et al. (1987), 
Gunning-Schepers (1988) and Barendregt and Bonneux (1998). At the 
Netherlands National Institute of Public Health and the Environment (RIVM) 
these types of models are being applied as tools to describe chronic diseases 
and their relation to population risk factors. The chronic diseases model 
(CDM) used in our study was developed by Hoogenveen et al. (1998). Its 
structure is related to the conceptual model devised for the 1997 edition of the 
Dutch report 'Public Health Status and Forecasts' (Ruwaard and Kramers 
(eds.), 1998), a leading publication issued every four years, summarising the 
different aspects of public health and public health policy in the Netherlands. 
In accordance with this conceptual model, health status is influenced by 
determinants that are affected by health policy (see 2.1 in Chapter 2). 
Autonomous developments may affect these three public health aspects and 
their relations. Health status can be seen as comprising several layers, namely 
disease-specific indicators such as disease incidence and prevalence figures, 
the consequences of diseases and disorders for physical, mental and social 
functioning, mortality figures, and measures that combine information on 
mortality and quality of life. The determinants can be split into two groups: 
health care (somatic and mental) and prevention. The prevention determinants 
can be divided into exogenous determinants (i.e. physical environment, 
lifestyle and social environment) and endogenous determinants (both 
hereditary and acquired). 



8.2.1. General Structure 

The model structure chosen to simultaneously describe the relations between 
several risk factors and diseases is a system-dynamic multistate model based 
on the life-table method, also known as the mover-stayer/increment-decrement 
model (see, for example, Barendregt and Bonneux, 1998; Chiang, 1968; 
Gunning-Schepers, 1988; Land and Rogers, 1982 and Manton and Stallard, 
1988). Two major components of the model structure are model states and 
state transitions (Figure 8.1). 

The model states have been defined on the basis of the risk factor classes and 
disease states distinguished. The state numbers are the population numbers in 
the model states as given by the distribution of the total population by risk 
factor classes and disease states. The state transition numbers are the 
population numbers that 'flow' between the model states in successive years. 
The flows are simply the population subgroups moving between the risk factor 
classes, disease incidence and recovery from a disease, and mortality numbers. 
The core of the chronic diseases model structure is a demographic module that 
describes annual changes in the population numbers specified by risk factor 
classes and resulting from changes in births, deaths, migrations, ageing and 
changes in risk factor levels of individuals ('risk behaviour'). The population 
was initially structured using demographic data from Statistics Netherlands. 
Figure 8.1. Basic structure of the chronic diseases model 

The risk factor prevalence rates have been estimated on the basis of Dutch 
registration and monitoring projects, specified by gender and age where 
possible. For every one-year time step, the following changes in population 
numbers have been calculated: 

• The numbers of newborn have been taken from the official forecasts of 
Statistics Netherlands; 

• Population numbers have been increased by the net migration rate; 

• Population numbers have been reduced as a result of mortality, specified by 
gender, age and cause of death. The disease modules describe 
cause-specific mortality; mortality due to all other causes has been generated 
externally; 

• Transitions between the risk factor classes have been completed using 
one-year transition rates. The risk factor class transition rates have been 
obtained from longitudinal studies or time series of cross-sectional studies; 

• New subpopulation numbers have been obtained for subsequent years by 
ageing. 
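
The one-year update listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the RIVM implementation: a single sex, single-year age groups with an open-ended last group, one risk factor with two hypothetical classes, one net migration rate for all groups, and all-cause death probabilities. The function and argument names are invented for the example.

```python
def one_year_step(pop, newborn, net_migration_rate, death_prob, transition):
    """Advance the population by one year.

    pop        -- list indexed by age; each item maps risk factor class to a count
    newborn    -- counts by class entering at age 0 (taken from an external forecast)
    death_prob -- list indexed by age; one-year death probability by class
    transition -- transition[src][dst] is the one-year class transition probability
    """
    classes = list(transition)
    survivors = []
    for age, row in enumerate(pop):
        # Net migration, then mortality.
        kept = {c: row[c] * (1.0 + net_migration_rate) * (1.0 - death_prob[age][c])
                for c in classes}
        # Movement between risk factor classes.
        moved = {c: 0.0 for c in classes}
        for src in classes:
            for dst, p in transition[src].items():
                moved[dst] += kept[src] * p
        survivors.append(moved)
    # Ageing: everyone moves up one year; the last two groups feed the open-ended group.
    aged = [dict(newborn)] + survivors[:-2]
    aged.append({c: survivors[-2][c] + survivors[-1][c] for c in classes})
    return aged


# Hypothetical inputs: three age groups, smokers and non-smokers, no deaths or migration.
pop = [{"smoker": 10.0, "non_smoker": 90.0},
       {"smoker": 20.0, "non_smoker": 80.0},
       {"smoker": 30.0, "non_smoker": 70.0}]
newborn = {"smoker": 0.0, "non_smoker": 100.0}
no_deaths = [{"smoker": 0.0, "non_smoker": 0.0}] * 3
quit_only = {"smoker": {"smoker": 0.9, "non_smoker": 0.1},
             "non_smoker": {"smoker": 0.0, "non_smoker": 1.0}}
result = one_year_step(pop, newborn, 0.0, no_deaths, quit_only)
```

In the chronic diseases model the mortality step is split by cause, with the disease modules supplying the cause-specific part; the sketch collapses this into one probability per age and class.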
The demographic module can be linked to specific disease modules. Each 
disease module describes the morbidity and mortality numbers over time for 
one specific disease category. Disease incidence is described in relation to the 
risk factor prevalence numbers and disease-specific relative incidence risks, 
and cause-specific mortality is described using case fatality rates. Disease remission and 
progression are described for the specific disease and the number of disease 
states distinguished. The transition numbers are calculated with the aid of 
one-year remission rates and disease progression rates. 
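
A minimal sketch of how a disease module might turn risk factor prevalence into incidence and deaths is given below. It assumes that incidence in each risk factor class equals a baseline incidence rate multiplied by the class-specific relative risk, and that deaths equal prevalent cases times a case fatality rate. The function names and all numbers are hypothetical.

```python
def incident_cases(pop_by_class, relative_risk, baseline_incidence):
    """New cases per year in each risk factor class (baseline rate times relative risk)."""
    return {c: pop_by_class[c] * baseline_incidence * relative_risk[c]
            for c in pop_by_class}


def disease_deaths(prevalent_cases, case_fatality_rate):
    """Cause-specific deaths per year among people who have the disease."""
    return prevalent_cases * case_fatality_rate


# Hypothetical inputs for one sex and age group.
population = {"never_smoker": 1000.0, "current_smoker": 1000.0}
rr = {"never_smoker": 1.0, "current_smoker": 10.0}
cases = incident_cases(population, rr, baseline_incidence=0.001)
deaths = disease_deaths(prevalent_cases=50.0, case_fatality_rate=0.2)
print(cases, deaths)
```

Under these assumptions a change in smoking prevalence shifts people between classes and thereby changes the number of new cases, while a change in survival enters through the case fatality rate.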



8.2.2. Model Characteristics 

The underlying mathematical model is stochastic and describes the change in 
mean population numbers in the model states. Setting the migration to zero, 
the ratio of the population numbers in the model states to the initial population 
numbers equals the probability distribution function for any initial person. This 
equivalence is valid because all model equations are linear with respect to the 
risk factor class and disease prevalence numbers. In comparison, this 
equivalence is not valid for infectious disease models that are based on the 
Anderson-May model structure because the number of newly infected persons 
is assumed to be proportional to both the numbers of susceptible and infected 
persons. 

The model satisfies the Markov property: conditional on the current model 
state (i.e. conditional on gender, age, risk factor classes and disease states), the 
probability distribution of the model states one time-step ahead is independent 
of the past model states. One consequence of this assumption is that the 
residence time in a model state has no effect on the (outflow) state transition 
probabilities. For example, conditional on gender, age etc., the survival 
probability of any stroke patient is assumed to be independent of his past 
disease period. 
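
The Markov property can be illustrated with a small sketch in which the distribution over model states one year ahead is computed from the current distribution alone. The three states and all transition probabilities below are hypothetical and refer to a single gender and age; in the model itself the probabilities would also vary with age and risk factor class.

```python
STATES = ["healthy", "stroke", "dead"]

# Hypothetical one-year transition probabilities (each row sums to 1).
P = {
    "healthy": {"healthy": 0.97, "stroke": 0.02, "dead": 0.01},
    "stroke": {"healthy": 0.00, "stroke": 0.90, "dead": 0.10},
    "dead": {"healthy": 0.00, "stroke": 0.00, "dead": 1.00},
}


def step(dist):
    """Return the state distribution one year ahead.

    Only the current distribution is used: how long a person has already
    spent in a state plays no role, which is the Markov property.
    """
    nxt = {s: 0.0 for s in STATES}
    for src, p_src in dist.items():
        for dst, p in P[src].items():
            nxt[dst] += p_src * p
    return nxt


dist = {"healthy": 1.0, "stroke": 0.0, "dead": 0.0}
after_one_year = step(dist)
after_three_years = step(step(after_one_year))
```

In this sketch a stroke patient faces the same 10 per cent annual mortality in the first year after the event as in any later year, which mirrors the stroke example given above.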

The model state variables can be divided into two subgroups, those to be 
interpreted as 'risk factors' and those to be interpreted as 'disease states'. 
According to the conditional independence assumption (common in the context 
of public health modelling), the disease transition rates are assumed to be 
mutually independent, conditional on the risk factors. For example, the 
survival probability of any coronary heart disease patient does not depend on 
the patient's disease state with respect to lung cancer, conditional on gender, 
age and smoking. As a result of this conditional independence assumption, the 
joint probability function for all disease prevalences is equal to the product of 
the marginal probability functions, conditional on the risk factors. The 
conditional independence assumption is empirically valid for most 
combinations of specific risk factors and diseases (for a counterexample, see 
below). 

Comorbidity is defined as the prevalence rate for one disease being dependent 
on the prevalence rate for another disease. Given the assumption of conditional 
independence, the CDM structure describes comorbidity as resulting from 
common risk factors, that is age and the epidemiological risk factors 
distinguished. There may be dependency relations between diseases that cannot 
be explained by the common risk factors. For example, diabetes mellitus is a 
risk factor for coronary heart disease that is independent of gender, age and 
overweight. This dependency relationship can only be modelled by dropping 
the conditional independence assumption, which results in a more complex 
model structure. 
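
A small numerical sketch may help to show how conditional independence within risk factor classes still produces comorbidity at the population level. All prevalence rates and class shares below are hypothetical, and a single risk factor (smoking, two classes) stands in for the full set of risk factors.

```python
# Hypothetical shares of the risk factor classes in the population.
share = {"smoker": 0.35, "non_smoker": 0.65}

# Hypothetical disease prevalence rates within each class.
prev_chd = {"smoker": 0.08, "non_smoker": 0.04}
prev_lung = {"smoker": 0.02, "non_smoker": 0.002}

# Conditional independence: within a class, the joint prevalence is the
# product of the two marginal prevalence rates.
joint_by_class = {z: prev_chd[z] * prev_lung[z] for z in share}

# Population-level rates are obtained by weighting with the class shares.
pop_joint = sum(share[z] * joint_by_class[z] for z in share)
pop_chd = sum(share[z] * prev_chd[z] for z in share)
pop_lung = sum(share[z] * prev_lung[z] for z in share)

# If the diseases were independent in the whole population, the joint
# prevalence would equal this product.
product_of_marginals = pop_chd * pop_lung
```

With these numbers the population-level joint prevalence exceeds the product of the population-level marginals, because both diseases are more common among smokers. A dependency that does not run through a modelled risk factor, such as the diabetes example, cannot arise this way.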

Competing death risks are described explicitly by distinguishing different 
causes of death. Individuals may die from any cause during the course of their 
lives. Under the assumption of conditional independence, cause-specific 
mortality rates are independent, conditional on age and risk factors. Because 
any individual may die from any cause during the course of his or her life, 
being saved from any specific cause of death owing to a specific intervention 
programme means he or she will die from the same cause or any other cause 
at an older age. People who have been 'saved' are assumed to be identical to 
those who have survived, conditional on the risk factors (see also Wong, 
1977). 

8.2.3. Main Model Equations 

Omitting the dependency of all numbers on gender, age and time, as well as 
omitting birth and migration, the differential equations for the disease 
prevalence probability function are described by: 

Change in disease prevalence numbers: 

$$
\begin{aligned}
\frac{d}{dt}\,PREV^{d} ={}& \sum_{z} inc^{d,z}\,POP^{z} && \text{new cases} \\
& - rem^{d}\,PREV^{d} && \text{remission} \\
& - m^{d}\,PREV^{d} && \text{disease-specific excess mortality} \\
& - \sum_{z}\sum_{d' \neq d} m^{d'}\,prev^{d',z}\,PREV^{d,z} && \text{mortality from other modelled diseases} \\
& - \sum_{z} m^{o,z}\,PREV^{d,z} && \text{mortality from remaining causes}
\end{aligned}
$$

Change in population numbers: 

$$
\begin{aligned}
\frac{d}{dt}\,POP ={}& - \sum_{d}\sum_{z} m^{d}\,PREV^{d,z} && \text{mortality from modelled diseases} \\
& - \sum_{z} m^{o,z}\,POP^{z} && \text{mortality from remaining causes}
\end{aligned}
$$

where 

PREV, prev   disease prevalence numbers, rates 
POP          population numbers 
inc          incidence rate 
rem          remission rate 
m            excess mortality rate 
m^o          mortality rate for all remaining (not modelled) causes 
d, z         index of (modelled) diseases and risk factors respectively 
             (d' denotes a modelled disease other than d) 



The model equations describe the life course of one cohort in accordance with 
the model states distinguished. To describe the total population, these 
equations are applied to all cohorts. At each point in time, the change in cohort 
distribution is computed conditional on the old one. Because of the linear 
dependency of the age variable on (time) period and cohort, the (birth) cohorts 
grow older (ageing) with time. 
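
To make the cohort bookkeeping concrete, the sketch below integrates a much simplified version of the equations with one-year Euler steps. It is an illustration under stated assumptions, not the authors' implementation: there is one disease and one risk factor class, mortality from other modelled diseases is dropped, incidence is applied to the population numbers as in the equations above, and all function names and rates are hypothetical.

```python
def euler_step(pop, prev, inc, rem, m_excess, m_other, dt=1.0):
    """One Euler step for a single disease and a single risk factor class.

    pop, prev -- population and disease prevalence numbers of one cohort
    inc, rem  -- incidence and remission rates per year
    m_excess  -- disease-specific excess mortality rate per year
    m_other   -- mortality rate for all remaining causes per year
    """
    d_prev = inc * pop - rem * prev - m_excess * prev - m_other * prev
    d_pop = -m_excess * prev - m_other * pop
    return pop + dt * d_pop, prev + dt * d_prev


def project(pop, prev, years, **rates):
    """Follow one cohort for a number of years; returns a list of (pop, prev)."""
    path = [(pop, prev)]
    for _ in range(years):
        pop, prev = euler_step(pop, prev, **rates)
        path.append((pop, prev))
    return path


# Illustrative rates only.
path = project(1000.0, 100.0, years=2,
               inc=0.01, rem=0.05, m_excess=0.1, m_other=0.02)
```

Applying `project` to every birth cohort, with rates that depend on the cohort's current age, would give the total population picture described above.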



8.3 | Analysis 

We compared the results of our epidemiological projections with the cause- 
specific mortality projections produced by the Netherlands Interdisciplinary 
Demographic Institute (NIDI) (Tabeau and Huisman, 1997). These demographic 
projections were based on dynamic parameterisation of time series of 
Dutch sex-, age- and cause-specific mortality statistics (Tabeau and Tabeau, 
1995). 

8.3.1. Scenarios 

Three scenarios have been designed assuming different trends in disease 
survival and developments in smoking prevalence (Figure 8.2). The three 
alternatives included a reference scenario, a feasible improvement scenario and 
a scenario to match the results obtained by trend extrapolation of mortality 
from lung cancer and coronary heart disease (demographic projections). 

In the reference scenario both future smoking prevalence and coronary heart 
disease survival are based on trend extrapolation. Smoking prevalence follows 
the trend observed between 1986 and 1995, resulting in a decrease from 37 
per cent smokers in 1994 to 34 per cent in 2015 for males and from 29 to 28 
per cent for females (Netherlands Foundation for Public Health and Smoking, 
1997). For males, coronary heart disease mortality declined by 4.2 per cent 
per year over the period 1987-1993, resulting in a decrease in case fatality 
from 3.2 per cent in 1994 to 1.7 per cent in 2015 (Bonneux et al., 1997). For 
females, the same case fatality rate is assumed as that for males. 
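
As a rough aid to reading these figures, the average annual rate of change implied by two endpoint values can be computed under an assumption of constant proportional decline. That assumption and the helper name `implied_annual_decline` are ours, for illustration; the source does not state how the trends were parameterised.

```python
def implied_annual_decline(start_value, end_value, years):
    """Constant annual proportional decline that takes start_value to
    end_value in the given number of years (hypothetical helper)."""
    return 1.0 - (end_value / start_value) ** (1.0 / years)


horizon = 2015 - 1994  # 21 years

# Case fatality for coronary heart disease: 3.2 to 1.7 per cent.
case_fatality_decline = implied_annual_decline(3.2, 1.7, horizon)

# Smoking prevalence among males: 37 to 34 per cent.
male_smoking_decline = implied_annual_decline(37.0, 34.0, horizon)
```

Under this assumption the case fatality figures correspond to a decline of roughly 3 per cent per year, and the male smoking figures to about 0.4 per cent per year.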

The feasible improvement scenario assumes that firm tax measures and 
persistent anti-smoking campaigns reduce the percentage of starters by 20 per 
cent in the period 1994-1996, after which this level is maintained. In addition, 
adult smoking prevalence is reduced by 14 per cent in 1994, followed by a 
reduction of 1-2 per cent in successive years (Van Genugten et al., 1999). 
Smoking prevalence would thus decrease to 19 per cent for males and 18 per 
cent for females in 2015. 

Finally, in the scenario to match epidemiological and demographic projections, 
smoking prevalence and case fatality rates were modified until the results of 
the two projection types more or less coincided. 



8.3.2. Results 

Figure 8.3 and Figure 8.4 present the demographic and epidemiological 
projections, the latter for three scenarios. 

Figure 8.3a. Standardized lung cancer mortality in different scenarios 

Figure 8.3b. Standardized lung cancer mortality in different scenarios 

Figure 8.4a. Standardized coronary heart disease mortality in different scenarios 

Figure 8.4b. Standardized coronary heart disease mortality in different scenarios 

The decline in lung cancer mortality among men in the reference scenario is 
much lower than in the demographic projection. Only a hardly realistic 
decrease in smoking prevalence from 37 per cent male smokers in 1994 to 
around 5 per cent in 2015 would yield results similar to the demographic 
projection. The mortality reduction in the feasible scenario is somewhere in 
between the two other variants. 

In all projections for females, lung cancer mortality increases, but the absolute 
number is still four or five times lower than for males. For women, the 
demographic projections yield higher mortality rates than the epidemiological 
model. To match the results of the two projection methods, smoking 
prevalence among women would have to increase from 29 to 40 per cent in 
2015. 

As was the case for lung cancer, there is a marked discrepancy in coronary 
heart disease mortality between the results of the reference scenario and the 
demographic projections: 20 per 100,000 for men, 35 per 100,000 for women 
in 2015. Matching the results of both methods requires an unrealistic decrease 
in smoking prevalence from 37 per cent in 1994 to 4 per cent in 2015 for males 
and from 29 to 5 per cent for females. In addition to that, case fatality would 
have to drop from 3.2 per cent in 1994 to 1.0 per cent in 2015 for males and 
to 0.6 per cent for females. 



8.4 | Discussion 

Our exercise demonstrates substantial discrepancies between the results of 
trend extrapolation of cause-specific mortality and those of the epidemiological 
projections using a reference scenario for risk factor prevalence. 

Differences in lung cancer mortality can be largely explained by trends in 
smoking prevalence over the last four decades. While there has been a 
dramatic decrease in smoking prevalence among men starting in the late sixties 
(from almost 90 per cent in 1960 to 40 per cent in the 1980s), in the same 
period women rapidly caught up with men. After 1975, the striking trends 
stabilised among both sexes, and then developed into a very small annual 
decrease. Only recently, however, have the rapid changes in smoking 
prevalence been reflected in lung cancer mortality. This may be explained by 
the existence of a latency period of several decades between the start of 
smoking and the incidence of lung cancer. Demographic projections are 
therefore largely determined by the conspicuous trends of the 1960s and 
1970s, overlooking the stabilisation in smoking prevalence that has taken place 
in the last 20 to 25 years. 

For coronary heart disease, the declining mortality rates can be explained only 
in part by changing trends in smoking prevalence. The latency period 
between the start of smoking and the incidence of coronary heart disease is 
shorter, around five years, while the relative risks found in cohort studies vary 
between one and three (Van der Mheen, 1996). For the sake of comparison: 
the relative risks of lung cancer amount to 12 for female smokers and 22 for 
male smokers. The impact of improved survival from myocardial infarctions, 
and a shift towards other heart failure-related causes of death is probably 
larger than the impact of trends in coronary heart disease risk factor 
prevalence (Bots and Grobbee, 1996; Bonneux et al., 1997). Trend 
extrapolation will therefore slightly underestimate mortality caused by 
coronary heart disease. 

For the purpose of disease modelling, we had to make major simplifications 
that are relevant to the validity of the forecasts. For example, our population 
has been divided into gender- and age-specific homogeneous and discrete 
groups, such as never-smokers, smokers and ex-smokers; or healthy people, 
lung cancer cases and coronary heart disease cases. In the real world, we find 
a broad, often continuous range of risk factors and health states. Another 
important assumption was the independence of disease risks. The model 
cannot deal with correlated genetic or acquired susceptibilities for different 
causes of death, as in the case of coronary heart disease and congestive heart 
failure. In our model, people who survived their myocardial infarction have 
the same mortality risk as any member of the population. This results in an 
underestimation of the associated mortality. 

Despite these shortcomings, our exercise demonstrates that underlying 
epidemiological processes, such as (rapid) developments in risk factor 
prevalence and medical technology, may actually be relevant to future cause- 
specific mortality projections. Almost by definition, these are not explicitly 
included in traditional demographic models. As the demographic and the 
epidemiological mortality projections each have their own specificity, with 
related disadvantages, they should be regarded as complementary rather than 
mutually exclusive. 

Abstract 

The most striking development of Dutch mortality in recent years has been the 
continuous rise in life expectancy for men and the stagnation for women. 
Another striking feature is the increase in mortality of the elderly aged 80 
years or over which contrasts with the continuous declines observed in most 
other western European countries. When predicting future mortality, these 
features should be taken into careful consideration in order to present and 
justify reasonable forecasts. 

This chapter reviews the method and assumptions of the mortality forecasts 
made as part of the 1998 official population projection for the Netherlands. It 
begins with an overview of trends in life expectancy and in mortality by age in 
the Netherlands (Section 9.1). Section 9.2 presents a discussion of the most 
important determinants of mortality considered in the official Dutch mortality 
forecasts. In Sections 9.3 and 9.4, some methodological issues are discussed, 
such as selection processes in cohorts and the usefulness of mortality by causes 
of death in forecasting mortality. Section 9.5 summarises the prospects for 
mortality developments in the future. On the basis of this information, it is 
assumed that life expectancy will continue to rise for both sexes, although 
more for men than for women, up to 80 years for men and 83 years for 
women by 2050. The death rates of people between the ages of 30 and 60, in 
particular, are expected to drop significantly. Finally, Sections 9.6 and 9.7 
discuss the method, assumptions, and uncertainty of these forecasts. With 
respect to the uncertainty, the 95 per cent forecast interval for life expectancy 
at birth in 2050 is assumed to be between 74 and 86 years for men and 
between 77 and 89 years for women. 



9.1 | Introduction: Recent Trends in Mortality in the Netherlands 

The most striking aspect of recent developments in mortality in the 
Netherlands is a slowing down of the increase in life expectancy of women 
and a continuous rise for men. Consequently the sex gap has narrowed. The 
major cause of the stagnation in life expectancy of women is the increase in 
mortality of middle-aged women in recent years. Another striking feature is 
the increase in death rates for men and women at age 80 years or over which 
is opposite to the continuous decline in most other western European 
countries. 

During the entire period from 1950 to 1988, there was a large increase in life 
expectancy for women which then stagnated from 1988 onwards. Life 
expectancy for men increased only slightly in the 1960s and even fell in the 
1970s, but then it began rising again and has continued to do so ever since 
(Figure 9.1). As a result, the life expectancy of females has increased by 
almost eight years since 1950 and that of males by less than five years. As a 
consequence, there is a marked change in sex differences in life expectancy. 
At the beginning of the 1980s it was over 6.5 years, but it has decreased since 
then and is now about 5.5 years (Figure 9.2). 

From 1950 to 1995 mortality rates for young people decreased much more 
than the rates for older people. For elderly women aged between 60 and 90 
there was also considerable progress. On the other hand, mortality rates for 
men aged over 60 years did not decrease much. 

In very recent years, other striking developments have taken place. First, 
mortality rates for middle-aged women have started to increase (Figure 9.3). 
Second, mortality rates for elderly people have been rising. For more than a 
decade now, mortality rates for men aged over 80 have increased and more 
recently mortality rates for elderly women have also started to rise. Possible 
causes of this trend are discussed in Section 9.3. 

Figure 9.1. Life expectancy at birth in the Netherlands 

Figure 9.2. Sex differences in life expectancy (F-M) 

Figure 9.3. Change in mortality rates ratio (1993-1997) / (1988-1992) 

These recent developments in mortality imply that the question about future 
changes in Dutch mortality is very important. The central issue in this chapter 
is the future of mortality in the Netherlands. 



9.2 | Determinants of Mortality 

Reviewing the most important determinants of mortality provides good insights 
into the main characteristics of this process and can therefore form the basis for 
formulating the assumptions needed (first qualitative, then quantitative) in 
mortality forecasting. This section briefly discusses the most important 
determinants of mortality in the Netherlands. Thus it provides the basis for the 
choice of assumptions used for mortality by Statistics Netherlands in the official 
projections of the Dutch population. 

The factors influencing mortality are numerous and the impact of separate 
effects and interrelationships is rather complex. Scientists agree on some 
topics, but there are disagreements on some very important issues, such as for 
instance future possible life span (longevity). At the outset, we can make a 
distinction between internal (endogenous) and external (exogenous) 
determinants of mortality. The former include heredity, personal 
characteristics and sex. The latter include education, lifestyle, living 
arrangements, housing and working conditions, environment and native 
country. These factors influence mortality through all kinds of intervening 
physical risk factors, such as hypertension, high cholesterol or obesity. 

Sex is a basic biological factor influencing mortality. Women live longer than 
men, but the difference is not constant through time. Around 1950 the 
difference was much smaller than nowadays, about 2.5 years, which is less 
than half of the current level of almost six years. The main cause of the 
widening gap has been a rather unfortunate mortality trend for men in vascular 
diseases, lung cancer and other lung diseases. Cigarette smoking is the major 
common risk factor of these diseases and has been estimated by Peto et al. 
(1994) to account for almost 35 per cent of male mortality in the Netherlands 
in 1990, compared with only 5 per cent of female mortality. Valkonen and 
Van Poppel (1997) show that more than half of the sex difference in life 
expectancy in the period 1985-1989 in the Netherlands was caused by 
smoking. Women started smoking in greater numbers much later than men. 
And so the sex difference in mortality has narrowed in recent years. 

Another cause of the sex difference is that women are less affected than men 
by the consequences of lower socioeconomic status, which implies only 
slightly higher mortality rates for women. The impact of working conditions on 
mortality of men and women is apparently not the same. 

In the Netherlands the sex difference in life expectancy is decreasing. This is 
probably caused by similarities in lifestyles of the sexes. But unfortunately, 
apart from smoking behaviour, the impact of other health-related behaviour on 
mortality remains largely unexplored. 

Marital status is usually considered when representing the group of 
demographic determinants of mortality. Married people have been shown to 
have lower mortality rates than those who are not married, and especially 
those who are divorced. The patterns observed can be characterised more 
accurately by examining the differences in causes of death. Joung (1996) 
showed that in the 1980s accidents and violence were responsible for a large 
part of the excess mortality of unmarried groups in the Netherlands. Cancers 
and circulatory system diseases accounted for a smaller part of the excess. A 
more difficult issue relates to two mechanisms which may cause the excess 
mortality of the unmarried population. “Selection” in marriage may be one 
cause (those in bad health are less likely to marry than healthy individuals). 
“Causation” through marriage may be another (partners care for each other 
and this has a positive causal effect on survival). Although some research 
results (ibid.) suggest that for men causation may be stronger than selection, 
the dilemma has not been definitely resolved for the Netherlands. 

Living environment belongs to the material determinants of health. There is 
little proof that quality of the natural environment affects mortality (WHO, 
1995). The most important environmental factor appears to be air pollution 
caused by suspended particulate matter (SPM), smog and aerosols. According 
to the WHO study the high mortality in eastern Europe is caused by 
socioeconomic factors rather than by the poor quality of the environment. 

In addition, urban-rural differences in mortality are often discussed within the 
context of the environmental determinants of mortality. In particular, mortality 
in large cities tends to be high due to air pollution, drugs, high alcohol 
consumption and violent deaths. Again the causal relation is complex. 
Interaction effects are likely: people with risky habits move to large cities. 

For many countries the relationship between individual health and 
socioeconomic status is well established: mortality of the lower socioeconomic 
classes is higher than of the upper. In some countries the gap has widened 
recently (Kunst and Mackenbach, 1995). Noin (1993) concludes that it is 
particularly the gap between unskilled workers and the rest of the working 
population which is widening. 

Individual behaviour is an intermediate factor representing influences of more 
distant mortality determinants, such as socioeconomic class or living 
environment. With respect to stress, one could expect a satisfactory life to 
increase its duration. Physical exercise has a positive effect on mortality as 
well. However, consumption patterns may have both positive and negative 
health implications. Experts agree about the negative role of smoking. The 
impact of diet however is less straightforward, except for the knowledge that 
saturated fats have a strong negative effect on cardiovascular diseases (the 
effect on cancer is less clear). The consequences of the consumption of other 
products are unclear, especially with expert knowledge changing over time. 
For instance, the cause of the decline in mortality from stomach cancer was 
attributed to the fall in the salt consumption, but this view was recently 
abandoned. 

Strong indications for the effect of diet on mortality are the observed regional 
differences in cancer mortality. In China and Japan breast and prostate cancer 
are much less prevalent than in North America and western Europe. The type 
of fat consumed in these regions probably plays a role, even though no real 
proof has yet been provided. Other factors such as lifestyles and the 
environment (for instance different soils and therefore different consumption 
of essential minerals) could be also partly responsible for the regional 
differences observed. Genetics is usually rejected as a possible explanation. 
After a few generations migrants show the same pattern of mortality as the 
native population. 

There are indications that some of the determinants are interrelated and may 
have common or combined effects on mortality. This phenomenon is known 
as interaction effects. For instance, a large proportion of the marital status 
difference in mortality is probably caused by differences in individual 
behaviour. Divorcees live relatively unhealthy lives and the mental health of 
widowed people is also bad (Van Hoorn, 1993). To give another example: the 
high mortality rates of the lower social classes may be partly attributed to 
individual behaviour (smoking and unhealthy eating habits), and partly to bad 
working and living conditions, social stress and unfavourable childhood 
circumstances, such as negative factors during the mother's pregnancy. 
However, selectivity by health may also have a role to play. 
Unhealthy people have greater difficulties in obtaining a good socioeconomic 
position. 



9.3 | Cohort Effects in Forecasting Mortality 

Analysing cohort effects in mortality is as helpful in formulating assumptions 
for prediction as investigating trends and patterns in determinants of the 
process. However, quantitative estimates of cohort effects in Dutch mortality 
are not applied due to the difficulties of obtaining unique estimates of these 
effects. We explain these difficulties at the end of this section. 

There are two contradictory hypotheses about cohort effects on old-age 
mortality, based on different mechanisms. The first mechanism relates to the 
assumption that changes in old-age mortality are affected by selection. Cohorts 
born at the end of the 19th century lived in harsh times under unfavourable 
conditions. In these cohorts frail individuals died long before they reached old 
age and therefore mortality rates of older survivors were low. Younger 
cohorts have lived under ever improving conditions (apart from relatively 
short periods of crises and wars) and therefore more individuals in bad health 
are still alive aged 80 or over. Mortality rates of the elderly are higher in 
younger than in older cohorts. Due to weaker selection, mortality rates of the 
elderly tend to rise in younger cohorts. An additional explanation of a negative 
development of old-age mortality might be the impact of wars: wars can lead 
to a selection mechanism due to excess mortality of young healthy men. 
However, the latter mechanism has not played an important role in the 
Netherlands. 

The second mechanism, protection, is based on the assumption that younger 
generations have lived under better conditions, their health is therefore less 
'damaged' than that of older cohorts and hence they will live longer. Under 
this hypothesis, mortality rates for the elderly will continue to decline over the 
long term. 

In the Netherlands mortality rates of the elderly (over 80 years of age) have 
not declined further in the last decade, contrasting with most other European 
Union countries where mortality of this age group is still falling. The current 
stagnation of mortality of those aged 80 years or over in the Netherlands 
seems to confirm the selection hypothesis (cohorts with growing numbers of 
relatively frail persons). Due to the fact that the elderly in the Netherlands 
have experienced significant improvements in living conditions and health 
care, there are still a relatively large number of weaker individuals who would 
have died at younger ages in earlier times. This is manifested in the increasing 
death rates of the elderly in the Netherlands. If this assumption is correct, one 
could expect the mortality of elderly people to start rising in other countries 
too. 

The increase in old age mortality does not necessarily imply that the protection 
mechanism is not at work. Its effects can be weaker than those of the selection 
mechanism. 

Cohort effects of mortality can be estimated by using age-period-cohort (APC) 
models (see also Chapter 1 of this book). An important difficulty of the APC 
analysis is the problem of identification. Separating age, period and cohort 
effects is obscured by the three factors' multicollinearity (cohort equals 
period minus age). In order to identify 
the effects of age, period and cohort, constraints must be imposed on the 
parameters of the model. Both the choice of constraints as well as the 
estimation period affect the estimation results. Vermunt (1991) for example, 
shows that different (overlapping) estimation periods result in different 
estimates of the cohort effects for the same cohorts. 
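
The identification problem can be shown with a short sketch. Because cohort equals period minus age, a linear trend can be shifted between the three sets of effects without changing any fitted value, so the data alone cannot distinguish the two parameter sets. The additive predictor, the function name `fitted` and all effect values below are hypothetical.

```python
def fitted(age_eff, period_eff, cohort_eff, ages, periods):
    """Additive APC predictor for every (age, period) cell,
    with cohort defined as period minus age."""
    return {
        (a, p): age_eff[a] + period_eff[p] + cohort_eff[p - a]
        for a in ages
        for p in periods
    }


ages = [0, 1, 2]
periods = [10, 11, 12]
cohorts = sorted({p - a for a in ages for p in periods})

# One (hypothetical) set of effects.
age_eff = {a: 0.5 * a for a in ages}
period_eff = {p: -0.1 * p for p in periods}
cohort_eff = {c: 0.02 * c for c in cohorts}

# A second set: add a linear trend d to the age and cohort effects and
# subtract it from the period effects.
d = 0.3
age_alt = {a: v + d * a for a, v in age_eff.items()}
period_alt = {p: v - d * p for p, v in period_eff.items()}
cohort_alt = {c: v + d * c for c, v in cohort_eff.items()}

base = fitted(age_eff, period_eff, cohort_eff, ages, periods)
alt = fitted(age_alt, period_alt, cohort_alt, ages, periods)
max_diff = max(abs(base[k] - alt[k]) for k in base)
```

Here `max_diff` is zero up to rounding error for any value of `d`, which is why a constraint has to be imposed before the effects can be estimated, and why the estimates depend on the constraint chosen.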

An additional problem of using the APC analysis for forecasting purposes is 
that various types of interaction effects occur. For example, a sharp decline in 
infant mortality in the past can be interpreted as a period-age interaction. 
However, if according to the selection mechanism this leads to an increase in 
mortality in old age, a cohort-age interaction is said to occur. A forecast based 
on a model with the period-age but without the cohort-age interaction may 
result in overestimating future life expectancy in the long term. 

In summary, when applying APC modelling, particular care should be taken 
when drawing conclusions. In addition, the existence of interaction effects 
implies that the usefulness of APC models for forecasts is limited, as it 
increases the number of assumptions to be made about future changes in 
parameters. It is for these reasons that APC models are not used in preparing 
Dutch population forecasts. 



9.4 | Usefulness of Causes of Death for Forecasting Mortality 

Some demographers believe that analysing mortality by causes of death can 
improve forecasts of overall mortality (see for instance Caselli, 1996; or 
Chapter 7 in this book). Disadvantages of this forecasting approach have been 
in turn noted by others. For instance, Alho (1991) came to the conclusion that 
using causes of death did not improve the statistical accuracy of his mortality 
forecasts for the USA. Murphy (1990), Alho (1991) and McNown and Rogers 
(1992) suggested that, particularly when non-linear models are used, 
implausible figures may result for rapidly changing causes of death. 

It is often the case in short-term forecasting and the application of trend 
extrapolation that the use of causes of death does not lead to considerably 
different results than those obtained from trend extrapolation of overall 
mortality. One should expect a high degree of similarity if the following 
conditions are met: 

• trends of the leading causes of death and of total mortality are quite regular 
and more or less linear; 

• for all causes of death and overall mortality the same extrapolation method 
is used; 

• no sharp changes in trends are assumed for the future; 

• no special adjustments are made for some causes of death (for instance to 
avoid negative numbers of deaths). 
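
The role of these conditions can be illustrated in the simplest case. If every cause and the total are extrapolated with the same ordinary least squares straight line and no adjustments are made, the cause-specific forecasts add up exactly to the forecast of the total, because the fitted line is linear in the data. The series below are hypothetical death counts, and the function name is ours; log-linear or other non-linear methods would not share this exact additivity.

```python
def linear_extrapolate(years, values, target_year):
    """Fit an ordinary least squares line and evaluate it at target_year."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    return mean_y + slope * (target_year - mean_x)


# Hypothetical numbers of deaths for two causes.
years = [1990, 1991, 1992, 1993, 1994]
cause_a = [300.0, 295.0, 288.0, 284.0, 277.0]
cause_b = [120.0, 123.0, 125.0, 129.0, 131.0]
total = [a + b for a, b in zip(cause_a, cause_b)]

by_cause = (linear_extrapolate(years, cause_a, 2000)
            + linear_extrapolate(years, cause_b, 2000))
overall = linear_extrapolate(years, total, 2000)
difference = abs(by_cause - overall)
```

In this sketch `difference` is zero up to rounding error. Differences between the two approaches therefore arise only when one of the listed conditions is relaxed, for example by using a non-linear model or by adjusting individual causes.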

The above hypothesis seems to be confirmed by the fact that the forecasts of 
Dutch mortality by cause of death made by Caselli and Egidi (1992) indeed 
resulted in values of life expectancy that did not differ much from the forecasts 
by Statistics Netherlands. Recent forecasts published by the Netherlands 
National Institute of Public Health and the Environment (Ruwaard and 
Kramers (eds.), 1998) which are based on trend extrapolation of mortality by 
ten leading causes of death also produce rather similar life expectancies for the 
period to 2015. 

For long-term forecasts, extrapolating trends in mortality is less useful as a 
forecasting method, irrespective of whether by cause of death or by all causes 
together. In making assumptions about life expectancy levels some 50 years 
ahead, qualitative arguments based on expert opinions should carry more 
weight. This is because of the questions that need to be answered in long-term 
forecasting. For example, in long-term forecasting an assumption has to be 
made on whether or not breakthroughs in gene therapy will lead to strong 
increases in life expectancy. Another assumption is needed concerning the 
extent of progress in medical science and its impact on health care. 

When preparing long-term forecasts of mortality by cause of death, trend 
extrapolation is not a proper forecasting method (although it is still in use) and 
instead one has to make plausible assumptions about developments in 
physiological and other risk factors that will affect separate causes of death. A 
major difficulty in this respect is the lack of knowledge about the precise role 
of many risk factors. Expert opinions about risk factors evolve as more 
knowledge becomes available. For instance, opinions have recently changed 
with respect to cholesterol; the causal effect now appears to be much more 
complicated than previously appreciated. For some diseases (for instance 
stomach ulcer and cancer) new risk factors have been found recently and 
previously held opinions proved false or at least incomplete. More and more 
infectious agents are detected as probable causes of cancers and auto-immune 
diseases, but it is not clear how these findings are to be used in medical 
treatments. Moreover, there is a lot of uncertainty about possible detrimental 
side effects of the use of medicine in the long term. 

Finally, genetic factors are important for all causes of death, including accidents, 
and for some causes they are the major determinant. On the other hand, many 
causes of death are also strongly related to differences in socioeconomic status. 
But the effects of both genetic and socioeconomic factors on diseases and 
mortality are not easy to predict. 

When discussing the use of causes of death in prediction of total mortality, 
some other general points are worth mentioning too. One important problem is 
the extrapolation of trends in mortality from the causes of death where those 
trends have changed rapidly in recent years. When such trends are extrapolated, the 
forecasts of mortality from these causes tend to become quite extreme within a few years. Causes 
of death with rapidly declining trends tend to disappear completely, whereas 
those with increasing trends tend to become predominant. 
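
The mechanism can be seen in a small numerical sketch. It assumes three hypothetical causes whose death rates change by a constant amount per year on the logarithmic scale (a log-linear extrapolation); the base rates and rates of change are invented for illustration only.

```python
import math


def project_loglinear(rate_now, annual_log_change, years_ahead):
    """Project a rate whose logarithm changes by a fixed amount each year."""
    return rate_now * math.exp(annual_log_change * years_ahead)


# Hypothetical base-year death rates per 100,000 and annual changes in the log rate.
causes = {
    "declining cause": (300.0, -0.04),  # log rate falls by 0.04 per year
    "stable cause": (300.0, 0.00),
    "rising cause": (50.0, 0.05),       # log rate rises by 0.05 per year
}


def shares(years_ahead):
    """Share of each cause in projected total mortality."""
    projected = {
        name: project_loglinear(rate, change, years_ahead)
        for name, (rate, change) in causes.items()
    }
    total = sum(projected.values())
    return {name: value / total for name, value in projected.items()}


for horizon in (0, 25, 50):
    print(horizon, {name: round(s, 2) for name, s in shares(horizon).items()})
```

In this invented example the rising cause starts as the smallest of the three and accounts for well over half of all projected deaths after 50 years, while the declining cause has almost vanished.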

When forecasting mortality by cause of death, the results depend very much on 
the choice of extrapolation method (linear or non-linear) and of base period. 
For example, international studies predicting mortality by cause of death for 
several European countries show that for some causes, future trends for 
different countries cross over (Caselli, 1996 and Tabeau et al., 1998). This is 
because trend extrapolation for countries with relatively high mortality from a 
certain cause in the distant past and a downward trend in the recent past may 
result in forecasts that are lower than the forecasts for countries with stable and 
relatively low mortality, or for countries in which mortality was low in the 
past but recently showed a moderate upward trend. Diverging trends in the 
countries make it even more difficult to formulate uniform plausible 
assumptions for the future. 
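
The cross-over follows directly from the arithmetic of linear extrapolation. As a minimal sketch, assume two hypothetical countries, one with a high but falling death rate from a given cause and one with a low but slowly rising rate; the levels and slopes below are invented.

```python
def crossing_year(level_a, slope_a, level_b, slope_b, base_year):
    """Year in which two linear trends extrapolated from base_year meet.

    Returns None when the trends are parallel and never meet.
    """
    if slope_a == slope_b:
        return None
    return base_year + (level_b - level_a) / (slope_a - slope_b)


def extrapolated(level, slope, base_year, year):
    """Value of a linear trend in a given year."""
    return level + slope * (year - base_year)


# Country A: rate 120 per 100,000 in 1995, falling by 3 per year (hypothetical).
# Country B: rate 80 per 100,000 in 1995, rising by 0.5 per year (hypothetical).
year_of_crossing = crossing_year(120.0, -3.0, 80.0, 0.5, 1995)
rate_a_2020 = extrapolated(120.0, -3.0, 1995, 2020)
rate_b_2020 = extrapolated(80.0, 0.5, 1995, 2020)
print(round(year_of_crossing, 1), rate_a_2020, rate_b_2020)
```

Country A starts well above country B but is projected to fall below it within little more than a decade, purely as a consequence of the chosen slopes and base period.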

Another difficult issue in forecasting mortality by cause of death is so-called 
'competing' mortality, i.e. the impact of a decline in one cause of death 
on the occurrence of deaths from other but related causes. Until now, there 
has been no satisfactory proposal for solving this issue. When forecasting 
mortality by cause, there is the usual assumption of independence of mortality 
from different causes of death, which is an oversimplification of the problem. 
There are indications that positive as well as negative relationships exist 
between causes of death and they are related to specific risk factors, types of 
medical treatment and other underlying determinants of health. Thus an assumption of 
strong interdependence between causes would bear a closer resemblance to reality. 

In the Dutch population forecasts, information on mortality by cause of death is 
used for interpreting mortality trends. However, in making the quantitative 
assumptions on future changes in mortality and life expectancy no distinction 
is made by cause of death. In order to improve accuracy in mortality 
forecasting, the prediction of cause-of-death specific mortality should be based 
on quantitative information about the impact of risk factors, socioeconomic 
determinants, genes and mutual correlation of mortality from related causes of 
death. However, this ideal is not yet feasible. 



9.5 I Future Prospects for Dutch Mortality 

Several major aspects of mortality are taken into account in the formulation of 
future prospects for Dutch mortality: trends in overall mortality, developments 
in causes of death and their determinants, and prospects for longevity. 

One general assumption underlying the official forecasts is that it will be hard 
to make further progress in the medical treatment of those diseases for which 
great improvements have already been made. Furthermore, the assumption 
can be made that in the short term not all diseases will benefit equally from 
improvements in health services, progress in medical knowledge and healthier 
lifestyles. For instance, little progress has been achieved in the treatment of 
most cancers and a breakthrough is probably part of a distant future. The 
breakthrough may be related to genes and their mutations which, it is 
suggested, contribute to the development of cancer. Because many genetic 
factors are involved, it is likely that only a moderate and gradual reduction of 
cancer mortality can be expected. 

According to the National Health Institutes in the USA and the Netherlands, 
genetic treatments have so far not been particularly successful, despite many 
efforts and various experimental techniques used. In the short term, gene 
therapy may be most successful (if at all) for some quite rare diseases. Most 
experts believe that gene therapy for common diseases can positively affect 
longevity of cohorts born after the year 2010 (Ruwaard and Kramers (eds.), 
1998). By 2050 these generations will be at most 40 years old and will have low 
mortality rates that will not affect life expectancy at birth very much. The 
health of older cohorts will improve with medical progress too, but as in the 
past, this process will be gradual and reductions of mortality will be small in 
the next decades. 

Deaths from most medical causes will fall. First of all, mortality from 
cardiovascular diseases can be expected to decline. Most experts foresee 
advances in medical treatment and reductions in the prevalence of 
cardiovascular risk factors such as smoking, bad diet and hypertension. Also a 
further decline in stomach cancer is expected. In contrast, deaths caused 
by some cancers, especially lung cancer in women, will continue to rise. 

As far as the prospects for longevity are concerned, the experts are divided into 
optimists and pessimists. The quintessential question is whether there is 
an upper biological limit to the length of human life (for instance, 115 years 
seems to be the limit nowadays). If ageing is seen as an intrinsic process in all 
human cells, the existence of a maximum life span is likely (Duchene and 
Wunsch, 1991). If one sees ageing as a multidimensional process of organ 
interaction in which partial loss of function in one organ is synergistically 
compensated by other organs, life span would not have a fixed upper limit. 
Only total failure or loss of a necessary organ system would result in death. In 
particular, future advances in genetic treatment have the potential to largely 
extend life span (Manton, 1991). 

However, setting aside the issue of whether new advances in medicine will be 
far-reaching and available in the near future, there is another question about 
whether all individuals will be able to benefit from new biotechnology (e.g. 
Day, 1991). The high costs of new treatments may become an insurmountable 
barrier for many members of the society. These kinds of painful decisions in 
the health care system are being discussed now, for instance about selection of 
people who may receive human transplant organs. 

Furthermore, new threats to life may arise from new virulent diseases for 
which no treatment or immunisation is available or from old ones for which 
the treatment is no longer effective. If today's major causes of death, such as 
cancer and heart disease, were eliminated, gains in life expectancy would only 
be moderate (Olshansky et al., 1990). So without medical breakthroughs, life 
expectancy at birth is not likely to rise much above the level of 85 years of 
age. 

For these reasons the official Dutch population forecasts expect a moderate 
increase in life expectancy (see also Van Hoorn, 1997). Most countries of the 
European Economic Area make similar assumptions in their forecasts with a 
few experts foreseeing a decrease in life expectancy due to the appearance of 
new diseases such as AIDS, resistant bacteria and the negative effects of 
environmental pollution. The official forecasts of Dutch mortality make the 
assumption that these negative factors will have less impact on mortality than 
positive factors like medical progress. 



9.6 I Mortality in the 1998 Netherlands Population Forecasts 

9.6.1. Method of Forecasting 

Gomez de Leon and Texmon (1992) distinguish four methods for forecasting 
mortality: (1) extrapolation of age-specific mortality rates, (2) extrapolation of 
parameters of a mortality model, (3) use of observed (low) mortality of a 
country, region or another particular population as a benchmark, and (4) 
forecasting by means of disaggregation. Expert judgement is of importance in 
the practice of population forecasting. Firstly, the choice of forecasting method 
is based on judgement. Secondly, this judgement is needed to address the 
question of whether the trends observed will continue in the long term. The 
answer to that question cannot be based only on purely statistical criteria. 

Extrapolation of age-specific mortality rates is based on the assumption that 
trends in mortality by age will continue in the future. Usually a logarithmic 
transformation of the rates is used. Either a linear trend model can be applied, 
assuming an undiminished continuation of observed changes, or a non-linear 
trend, with for instance a levelling off of the rate of change in the long term. 
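
As a minimal sketch of the two options, assume an invented series of death rates for one age group. The linear variant fits a straight line to the logarithm of the rates and extends it; the non-linear variant shown here lets the yearly change shrink by a fixed damping factor, which is only one of many possible ways to model a levelling off and is an assumption of this example, not the specification used in any official forecast.

```python
import math


def fit_log_trend(years, rates):
    """Least squares line through log(rates); returns (intercept, slope)."""
    logs = [math.log(r) for r in rates]
    n = len(years)
    mean_t = sum(years) / n
    mean_l = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in years)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(years, logs))
    slope = sxy / sxx
    return mean_l - slope * mean_t, slope


def linear_forecast(years, rates, target_year):
    """Undiminished continuation of the observed change in the log rate."""
    intercept, slope = fit_log_trend(years, rates)
    return math.exp(intercept + slope * target_year)


def damped_forecast(years, rates, target_year, damping=0.97):
    """Yearly change in the log rate shrinks by `damping` each year ahead."""
    intercept, slope = fit_log_trend(years, rates)
    last_year = years[-1]
    log_rate = intercept + slope * last_year
    step = slope
    for _ in range(target_year - last_year):
        step *= damping
        log_rate += step
    return math.exp(log_rate)


# Invented death rates for one age group, declining by 2 per cent a year on the log scale.
years = list(range(1980, 1996))
rates = [0.010 * math.exp(-0.02 * (t - 1980)) for t in years]

linear_2050 = linear_forecast(years, rates, 2050)
damped_2050 = damped_forecast(years, rates, 2050)
print(round(linear_2050, 5), round(damped_2050, 5))
```

The damped variant yields a higher rate in 2050 than the linear one, since the assumed decline gradually peters out; the size of the difference depends entirely on the damping factor chosen.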

The major feature of method 2, extrapolation of parameters of a mortality 
model, is a reduction of the number of extrapolations to be made. Instead of 
projecting all separate age-specific rates, a limited number of parameters of a 
model age schedule are projected. Sometimes only a few substantive 
arguments are used to justify the choice of a model; in other cases a 'law of 
mortality' is sought. However, this method has proved disappointing, as there are 
almost no models with only a few easily interpretable parameters. 

Method 3 is obvious. Often the level of mortality in Sweden or Japan is used 
as a benchmark. 

Disaggregation of mortality can be based on various criteria, for example, 
marital status, region, social class, or by distinguishing causes of death, each 
criterion stressing the importance of another group of determinants of 
mortality. Projecting mortality by analysing its components is rarely used in 
official national forecasts. 

For long-term forecasts one might question the validity of the assumption that 
trends will continue. Experts may expect developments that have not yet been 
observed but are possible according to their judgement. And thus this 
judgement provides an important contribution to the making of assumptions 
for long-term forecasts. When doing population forecasting, it is common 
practice to set target values on the basis of experts' judgements. 

The methodology used in Dutch population forecasts can be considered as a 
mixture of trend extrapolation of age-specific death rates and expert 
judgement. Assumptions about life expectancy for the short term are based on 
an extrapolation of recent trends. Assumptions for the target year 2050 are 
based on a combination of time series analyses, study of the literature and 
expert opinion (judgement). The values of life expectancy for the years in the 
period between the short- and long-term forecast horizon are determined by 
means of interpolation using a third degree polynomial. 

Since the time when Gompertz proposed his function scientists have searched 
for a 'law' of mortality. At first functions with two or three parameters (e.g. 
Makeham) were used. Later more complex models were developed. In general 
models with more parameters produce a better fit to observed age profiles of 
mortality rates. However, as Keyfitz (1990) points out, a good fit can hardly 
be obtained for mortality at all ages from models with a low number of 
parameters. Nowadays the Heligman and Pollard model with eight parameters 
is often used (e.g. Tabeau et al., 1998). The parameters represent features of 
mortality at different ages. Using the model age schedules as proposed by 
Heligman and Pollard, the parameters of the fitted function can be projected 
instead of separate age-specific mortality rates. 
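
For reference, the commonly cited form of the Heligman and Pollard model expresses the odds of dying at age x as the sum of three terms: a childhood term, an 'accident hump' at young adult ages and a senescent term. The sketch below implements that form; the parameter values are invented for illustration and are not estimates for the Netherlands or any other country.

```python
import math


def heligman_pollard_qx(x, A, B, C, D, E, F, G, H):
    """Probability of dying at age x under the eight-parameter model.

    The model is written for the odds q/(1-q):
        A**((x + B)**C) + D*exp(-E*(ln x - ln F)**2) + G*H**x
    """
    childhood = A ** ((x + B) ** C)
    # The hump term tends to zero as x approaches 0, where ln x is undefined.
    hump = D * math.exp(-E * (math.log(x) - math.log(F)) ** 2) if x > 0 else 0.0
    senescent = G * H ** x
    odds = childhood + hump + senescent
    return odds / (1.0 + odds)


# Invented parameter values, chosen only to give a plausible-looking age profile.
params = dict(A=0.0005, B=0.01, C=0.1, D=0.001, E=10.0, F=20.0, G=0.00005, H=1.1)

for age in (0, 10, 20, 40, 80):
    print(age, round(heligman_pollard_qx(age, **params), 5))
```

Projecting mortality with this model means extrapolating the eight parameters over time and recomputing the age schedule, rather than extrapolating each age-specific rate separately.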

The Dutch population forecasts do not use parameterisation of mortality age 
profiles. Instead, they apply an age schedule of future changes of mortality 
rates. This schedule is based on past observations, but it is adjusted on 
qualitative grounds (see Figure 9.4). The main underlying assumption of this 
schedule is that the survival curve will become more rectangular, i.e. life 
expectancy at birth will rise because more people will become old rather than 
because more old people will become even older. This implies that mortality 
rates at middle age will decline more strongly than mortality rates at the 
highest ages. 

When making assumptions about future changes in life expectancy at birth it is 
important to consider that there is a non-linear relationship between changes in 
age-specific mortality rates and changes in life expectancy at birth. One 
consequence of the rectangularisation of the survival curve is that mortality 
rates have to decline ever more markedly in order to attain a linear increase in 
life expectancy at birth. For example, if all age-specific mortality rates 
measured in the Netherlands in 1950 are reduced by 50 per cent, life 
expectancy at birth increases by about nine years. However, a 50 per cent 
reduction of the 1995 mortality rates leads to an increase in life expectancy of 
only 7.5 years. It is for this reason that Dutch population forecasts assume a 
gradual decline in the increase in life expectancy at birth. 
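
The non-linear relationship can be made concrete with a small calculation. The sketch below computes life expectancy at birth by numerical integration of the survival curve and compares the gain from halving all rates in two invented mortality schedules: an 'earlier' one with an age-independent background component and a flatter age gradient, and a 'later', more rectangular one. Both schedules are hypothetical Gompertz-type curves, so the results illustrate the mechanism and do not reproduce the Dutch figures quoted above.

```python
import math


def life_expectancy(hazard, max_age=130.0, step=0.05):
    """Life expectancy at birth as the area under the survival curve (trapezoid rule)."""
    n = int(round(max_age / step))
    cumulative_hazard = 0.0
    previous_survival = 1.0
    area = 0.0
    for i in range(n):
        x0, x1 = i * step, (i + 1) * step
        cumulative_hazard += 0.5 * (hazard(x0) + hazard(x1)) * step
        survival = math.exp(-cumulative_hazard)
        area += 0.5 * (previous_survival + survival) * step
        previous_survival = survival
    return area


def schedule(background, level, slope, factor=1.0):
    """Hazard c + a*exp(b*x), with all rates multiplied by `factor`."""
    return lambda x: factor * (background + level * math.exp(slope * x))


def gain_from_halving(background, level, slope):
    before = life_expectancy(schedule(background, level, slope))
    after = life_expectancy(schedule(background, level, slope, factor=0.5))
    return after - before


# Hypothetical parameters: the 'earlier' schedule has background mortality at
# all ages; the 'later' schedule has none and rises more steeply with age.
gain_early = gain_from_halving(background=0.0015, level=0.00005, slope=0.095)
gain_late = gain_from_halving(background=0.0, level=0.00002, slope=0.105)
print(round(gain_early, 1), round(gain_late, 1))
```

The same 50 per cent reduction produces a smaller gain in the more rectangular schedule, which is the reason a constant rate of mortality decline translates into a decelerating rise in life expectancy.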

Figure 9.4. Mortality rates in 2050 (1995=100) 

9.6.2. Forecasting Assumptions 

In the short term, (linear) trend extrapolation is used to predict Dutch mortality 
with slopes estimated from models describing most recent trends in overall 
mortality. A statistical procedure defines the length of the period taken into 
account when estimating the slopes. In this procedure a linear spline model 
(see the appendix) is applied to past long-term trends in mortality in order to 
estimate significant turning points. The turning points divide the entire 
observation period into sub-periods, each showing a different rate of change in 
overall mortality over time. It became apparent that the latest significant 
changes in the slope of the trend occurred in 1988 for women and in 1981 for 
men. The slopes observed since these turning points were taken as the current slopes of 
the trends and were used to project changes in life expectancy in the short term. 

Assumptions used in long-term forecasting of Dutch mortality only marginally 
involve trend extrapolation. These assumptions are primarily qualitative 
statements formulated on the basis of the overview of determinants (Sections 
9.2 and 9.3) as well as of general prospects for disease development and 
longevity (Section 9.5). The assumptions are used to set the values of life 
expectancy for men and women by 2050 and are the following: 

• the sex difference in life expectancy will reduce to three years by 2050 (it is 
assumed that about half of the recent sex difference can be attributed to 
differences in smoking in the past and that in the future women will smoke 
about as much as men); 

• for men it is assumed that the latest observed trend reflects long-term 
developments, including medical progress and changes in socioeconomic 
structures, living and working conditions and individual behaviour; 

• for women it is assumed that the latest observed trend (no improvement) 
does not reflect a long-term development, but rather a temporary stagnation 
due to the increasingly negative influence of smoking in the past, 
comparable to the negative development for men in the 1960s; 

• the change in life expectancy at birth will gradually slow down due to the 
non-linear relationship between changes in age-specific mortality rates and 
changes in life expectancy. 

In the medium variant, these assumptions lead to life expectancy at birth of 83 
years for women and 80 years for men by 2050. An interpolation by means of a 
third degree polynomial is used to obtain the values of life expectancy in the 
years between 1998 and 2050. Figure 9.5 shows the outcome of the above 
procedure for men and women. 
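
The text does not state which conditions determine the polynomial. A third degree polynomial has four coefficients, so one natural reading, adopted here purely as an assumption, is that it matches the starting value and short-term slope in 1998 and the target value and an assumed slope in 2050 (a cubic Hermite interpolation). The starting value and both slopes in the sketch are placeholders, not published figures; only the target of 80 years for men in 2050 comes from the text.

```python
def cubic_path(year, start_year, start_value, start_slope,
               end_year, end_value, end_slope):
    """Cubic Hermite interpolation between two years.

    The curve passes through both end values and has the given slopes
    (in years of life expectancy per calendar year) at both ends.
    """
    span = end_year - start_year
    s = (year - start_year) / span
    h00 = 2 * s**3 - 3 * s**2 + 1
    h10 = s**3 - 2 * s**2 + s
    h01 = -2 * s**3 + 3 * s**2
    h11 = s**3 - s**2
    return (h00 * start_value + h10 * span * start_slope
            + h01 * end_value + h11 * span * end_slope)


def male_path(year):
    """Hypothetical path for men: placeholder start of 75.0 years in 1998."""
    return cubic_path(year, 1998, 75.0, 0.15, 2050, 80.0, 0.05)


for year in (1998, 2010, 2030, 2050):
    print(year, round(male_path(year), 2))
```

With these placeholder inputs the path rises in every year but by less and less, in line with the assumption that the increase in life expectancy gradually slows down.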

The increase in life expectancy of men by over five years can only be reached 
when large reductions of the age-specific mortality rates are assumed. In 
particular for the age group 30 to 60 years it is assumed that there will be a 
strong decline of mortality (see Figure 9.4). For people aged 80 years or over it is 
assumed that mortality rates will start to decrease again, contrary to the current 
rising trend. This ties in with the opinion that medical progress will be greater 
than the selection effects assumed to be causing the current increase in mortality 
rates of the elderly. 



Figure 9.5. Life expectancy at birth, 1998 Netherlands Population Forecasts 

9.7 I Uncertainty of Mortality Forecasts 

Changes in life expectancy at birth are the result of changes in mortality of 
different age groups. When assessing the degree of uncertainty of life 
expectancy forecasts it is important to make a distinction by age, as the degree 
of uncertainty of future changes in mortality differs between age categories. The 
effect of the uncertainty surrounding the future development of mortality at 
young and adult ages on life expectancy at birth is small, because of the current, 
very low levels of mortality at these ages. On the basis of the current 
age-specific mortality rates, 95 per cent of male live births and about 96.5 per cent of 
female live births would reach the age of 50. Clearly the upper limits are not far 
away. According to the medium variant of the 1998 Netherlands population 
forecasts, the percentage of men surviving to age 50 will rise to 97.1 per cent in 
2050 and the percentage of women to 97.5 per cent. A much larger increase is 
not possible. A decrease does not seem very likely either. That would imply for 
instance that mortality from one or more of the following causes of death would 
have to increase: neonatal or infant mortality, accidents, suicide and new 
virulent diseases. From the past experience it may be concluded that these 
changes are either unlikely (accidents, suicide) or would have a small impact on 
total mortality (infant mortality of ethnic groups, new epidemics like AIDS). 

Assuming that as in the medium variant there is a tendency towards 
rectangularisation of the survival curve and limited growth of the maximal life 
span, there is relatively high uncertainty about the future percentage of 
survivors around the median age of dying. If the percentage of survivors 
around that age was higher than in the medium variant (i.e. if the median age 
was higher), the decrease of the survival curve at the highest ages would be 
steeper than in the medium variant. Thus the deviation from the medium 
variant at the highest ages will be smaller than around the median age. 

The degree of uncertainty about the future development of life expectancy at 
birth can be specified by making assumptions about the upper and lower limits 
of a 95 per cent forecast interval, i.e. by choosing values for which it is assumed 
that there is only a 5 per cent probability of a higher forecast than the high value 
or a lower forecast than the low value. These assumptions can be based on three 
methods: analysis of forecast errors of forecasts published in the past, statistical 
time series models of life expectancy or of age-specific mortality rates, and 
judgement. One problem of using the first method is that it is barely useful for 
assessing the uncertainty of forecasts for the very long term, since there are 
hardly any historic forecast errors for the very long term. Moreover, even if they 
were available, the question arises as to whether they would be very relevant, as 
they would reflect errors of forecasts published in the distant past. One solution to 
this problem may be to extrapolate historic forecast errors. A time series model 
can be identified that projects the development of the size of forecast errors with 
the forecast lead time. It appears that a random walk model is appropriate. This 
model projects that the 95 per cent forecast interval for the year 2050 equals 8 
years. 

On the basis of a statistical time series model, e.g. an ARIMA model, a forecast 
interval can be calculated. The development of life expectancy can be described 
by a random walk model with drift. The 95 per cent forecast interval produced 
by this model for the year 2050 equals 12 years. This forecast interval is only 
valid with the assumption that the selected time series model is correct. 
However, since the medium variant does not correspond with the projection of a 
time series model (the reason being that the forecast is not only based on 
extrapolations but also on judgement), the forecast interval of the time series 
model may not be consistent with the medium variant. 
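
Under a random walk with drift, the standard deviation of the forecast error grows with the square root of the lead time, so the width of the interval follows directly from the standard deviation of the annual changes. The sketch below shows the calculation and backs out the annual standard deviation implied by a 12-year interval. It ignores uncertainty in the estimated drift and assumes a jump-off year of 1998 (a lead time of 52 years); both are simplifying assumptions of this example, and the short series is invented.

```python
import math

Z_95 = 1.96  # two-sided 95 per cent point of the standard normal distribution


def drift_and_sigma(series):
    """Mean and sample standard deviation of the annual changes."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    drift = sum(diffs) / len(diffs)
    variance = sum((d - drift) ** 2 for d in diffs) / (len(diffs) - 1)
    return drift, math.sqrt(variance)


def interval_width(sigma, lead_time):
    """Width of the 95 per cent interval after `lead_time` years."""
    return 2 * Z_95 * sigma * math.sqrt(lead_time)


def implied_sigma(width, lead_time):
    """Annual standard deviation consistent with a given interval width."""
    return width / (2 * Z_95 * math.sqrt(lead_time))


# Invented life expectancy series, used only to show the estimation step.
drift, sigma = drift_and_sigma([70.0, 70.2, 70.3, 70.6])

# Standard deviation implied by a 12-year interval 52 years ahead.
sigma_12 = implied_sigma(12.0, 52)
print(round(sigma_12, 2), round(interval_width(sigma_12, 13), 1))
```

Because of the square-root rule, the interval a quarter of the way to the horizon is already half as wide as the interval at the horizon.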

Because the medium variant forecast of life expectancy is not simply based on 
an extrapolation of observed trends, but rather on judgement, the assessment of 
the degree of uncertainty should be based on judgement too. Even if historic 
forecast errors are analysed, judgement plays a role, since in assessing the 
uncertainty of new forecasts an answer has to be given to the question to which 
extent errors of forecasts that were made in the distant past give an indication of 
the uncertainty of forecasts that are made now. However, this does not imply 
that the analyses of past forecast errors and the forecast intervals based on time 
series models are not relevant. On the contrary, they provide a benchmark which 
guards against underestimating the uncertainty of new forecasts. 

According to the medium variant 56 per cent of women will survive to age 85 
on the basis of the mortality rates for 2050. The lower limit of the 95 per cent 
forecast interval is based on the assumption that it is very unlikely that less 
than one-third of women will survive to age 85. That would correspond with a 
median age at dying of 81 years. This equals the level reached in the 
mid-1970s. This could come true if for example there was a strong increase in 
mortality from lung cancer and coronary heart disease due to an increase in 
smoking. The assessment is done in a similar way for the upper limit of the 
forecast interval for women and the lower and upper limits for men. 

Based on the 95 per cent forecast interval for the median ages and the age 
pattern of mortality change which is used in the medium variant (Figure 9.4), forecast 
intervals for all other ages are assessed and they are smaller than for the 
median ages (in accordance with the assumption of increasing rectangularity). 

Finally the assessments made result in a 95 per cent forecast interval for life 
expectancy at birth in 2050 of 12 years. For men the interval ranges from 74 to 
86 years and for women from 77 to 89 years. This interval corresponds with the 
interval based on the random walk with drift model of life expectancy at birth. 
The intervals for the years up to 2050 are assessed on the basis of the random 
walk model. Figure 9.5 shows the low and high variants corresponding with the upper 
and lower limits of the 95 per cent forecast interval for life expectancy at birth. 



Appendix: Spline Function 

In order to assess turning points in the trend of life expectancy a linear spline 
function is used (Suits et al., 1978). Using this function the period of 
observation can be divided into intervals with different growth rates (McNeil 
et al., 1977). The linear spline function is defined by: 



$$y_t = a_0 + a_1 t + \sum_i \beta_i (t - t_i) D_i + \varepsilon_t$$

where $y_t$ is the value of the life expectancy in year $t$; $t_i$ are the turning points 
("knots"); $\varepsilon_t$ is the error in year $t$; $D_i = 0$ if $t \le t_i$ and $D_i = 1$ if $t > t_i$. Using the 
non-linear least squares method, the years in which the trend has changed 
significantly can be assessed. 
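
A minimal sketch of the idea for a single knot is given below. Instead of non-linear least squares it uses a simpler substitute: for each candidate knot year the model is linear in its coefficients and is fitted by ordinary least squares, and the knot with the smallest residual sum of squares is retained. The test of whether a turning point is statistically significant is not reproduced, and the series is synthetic, constructed with a known change of slope in 1981.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_spline(years, values, knot):
    """Fit y = a0 + a1*(t - t_first) + b1*max(0, t - knot); returns (coefficients, SSE)."""
    first = years[0]  # time is measured from the first year for numerical stability
    rows = [[1.0, float(t - first), float(max(0, t - knot))] for t in years]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    coef = solve(xtx, xty)
    sse = sum((y - sum(c * v for c, v in zip(coef, r))) ** 2
              for r, y in zip(rows, values))
    return coef, sse


def best_knot(years, values):
    """Candidate knot year with the smallest residual sum of squares."""
    candidates = years[2:-2]  # keep at least two observations on either side
    return min(candidates, key=lambda k: fit_spline(years, values, k)[1])


# Synthetic series: slope 0.05 per year until 1981, then 0.20 per year.
years = list(range(1970, 1998))
values = [73.0 + 0.05 * (t - 1970) + 0.15 * max(0, t - 1981) for t in years]

knot = best_knot(years, values)
coefficients, _ = fit_spline(years, values, knot)
print(knot, [round(c, 3) for c in coefficients])
```

The slope after the knot is the sum of the second and third coefficients, so the third coefficient measures the change in the rate of increase at the turning point.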


10. The Latest Mortality Forecasts 
in the European Union 



Harri CRUIJSEN and Harold EDING 



Abstract 

Chapter 10 reviews the outcomes, methods and assumptions in the latest¹ 
mortality forecasts for the European Union countries. Section 10.1 introduces 
the topic, then Section 10.2 highlights general features of national forecasts 
such as, for instance, their timing, type (i.e. forecasts, projections, or 
scenarios), uncertainty variants, length of forecast horizons, and 
disaggregation of the population. Section 10.3 gives the values of life 
expectancy, predicted increases and sex differences by the year 2020. National 
forecasts are discussed within the context of an international perspective. 
Three comparisons of national forecasts are made here: comparisons with the 
previous national results, with the life expectancies from the United Nations 
1996-based population projections, and finally with the 1995-based Eurostat 
long-term mortality scenarios. Section 10.4 discusses the age-related 
assumptions used by the various countries and gives more in-depth 
explanations of the rationale underlying the outcomes of national forecasts. 
Sections 10.5 to 10.7 review the methods applied in the most recent national 
forecasts, the justification of assumptions and the use of variants. Section 10.8 
comprises the principal conclusions. 



¹ Forecasts compiled during the period 1992-1997. 

10.1 I The Growing Importance of Mortality in Ageing Populations 

Mortality is slowly but steadily gaining importance among both forecasters 
and users of demographic population forecasts. The continuing increase in 
life expectancy to levels considered more or less impossible a few decades 
ago and the growing awareness of the fact that the magnitude and speed of 
population ageing is increasingly determined by survival rates among the 
elderly, have shifted mortality issues up several notches on the demographic 
research agenda. Various statistical agencies suggest that more time and 
effort should be devoted to the methods and outcomes of mortality 
prediction. International comparisons of methods and predicted trends could 
be used in the justification of assumptions used in national forecasts. 

Whatever methods and assumptions are used, prediction cannot be separated 
from the trends in recent mortality. In our part of the world mortality trends 
are quite smooth except in the event of an extremely cold winter, a hot 
summer or for instance widespread unforeseen influenza. Significant 
fluctuations are only observed in such exceptional years. However, relatively 
bad years are often followed by relatively good ones, resulting in quite 
stable changes in mortality over five-year periods. All this implies that the 
key question for mortality forecasters in more developed countries is 
generally not “What will happen to mortality in the near future?” but “What 
trends and patterns will be observed over the next two to three decades?” 

This paper reviews the latest answers to this question collected from 
forecasters working at the national statistical agencies in the European 
Union (EU) countries. Of the 15 EU Member States, 13 countries compiled 
their latest mortality forecasts during 1992-97. These forecasts were made 
as part of the latest official population projections. Initially for these 13 
countries, an overall impression is given of the content and nature of the 
latest national forecasts of mortality (Section 10.2). Section 10.3 examines 
the official mortality forecasts made ten years ago and those recently 
prepared by international organisations. Section 10.4 investigates future 
differences in mortality by age. Sections 10.5 to 10.7 contain discussions of 
the use of variants by the countries, of their justifications for the national 
mortality assumptions and of the forecasting methodologies. Section 10.8 
summarises this review. 

10.2 I General Features of the Latest Mortality Forecasts in the 
European Union 

Apart from Greece and Portugal, all the other EU Member States revised 
their national population projections during the period 1992-1997. Some 
countries (Austria, Denmark, Finland, Netherlands and Sweden) update 
their projections annually, but a few countries (France, Ireland and Italy) 
revise their forecasts only when the results of a new population census 
become available. All countries produce a forecast or a 'best guess'. In their 
publications, the Irish demographers only discuss scenarios or variants, but 
in practice they advise users to refer to the population (and therefore also 
mortality) forecast. Luxembourg presents forecasts up to 2010 (Luxembourg 
II) and then long-term scenarios (Luxembourg I) up to 2050. 

The vast majority of countries make one population forecast accompanied by 
at least two uncertainty variants, formulated as both relatively low and high 
population growth projections, mostly due to altered fertility and/or migration 
inputs. Spain has not produced any variants, whereas Austria and Belgium 
have formulated 12 different population projections (Figure 10.1). 

Almost all countries forecast four to six decades ahead (Figure 10.2). Only 
Luxembourg stops its forecast calculations in the not too distant future: no 
predictions are provided for after 2010. Instead, long-term population 
scenarios are shown up to 2050. 



Figure 10.1. Number of projection variants 

Figure 10.2. Length of projection period (in years) 

In order to warn users against a time-related increase of the uncertainty of 
prediction, some countries change their terminology depending on the length 
of the prediction horizon. For instance the Austrian forecasters speak about 
projections for the period 1996-2030 and about model calculations for the 
period 2030-2050. 

Finally, all countries' national population and mortality projections are 
broken down at the least by sex and single years of age. The highest
population age group varies from 95+ to 109 years. 95 years and over
(95+) is used in Austria and Luxembourg, exactly 98 years in Germany,
99+ in Ireland, the Netherlands and Spain, 100+ in Belgium, Finland and
the United Kingdom, 105+ in France and Sweden, 108+ in Italy and
exactly 109 years in Denmark. Most countries use the age 99+ or 100+ as
their upper age limit. Denmark and Germany assume that everybody dies at
a certain maximum age. Surprisingly enough, the German forecasters work
with a maximum age lower than 100: in their projection model all people
who reach the age of 98 years do not survive in the subsequent calendar
year.

10.3 Life Expectancy in the Future: the Optimistic and the Pessimistic
Views

All the national statistical agencies concerned foresee further improvements in 
life expectancy at birth for both males and females for the period 1995-2020, 
but generally more slowly than that observed during the period 1970-1995 
(Figure 10.3).

The Danish forecasters appear to be the most pessimistic: they expect a gain of 
no more than 0.5 years (females) or 1.5 years (males) in the next two to three 
decades. In fact they assume that the long-standing recorded trend of 
increasing life expectancy will stop around the year 2010. Finland and the 
Netherlands are also fairly cautious for females. The most optimistic forecasts 
come from France, which expects a further improvement of over four years 
between 1995 and 2020. The increases foreseen in Austria and Ireland
(females) are also well above average.

With regard to the differences in life expectancy between females and 
males, all Scandinavian EU countries expect a decrease of approximately 
one year. 

Forecasters in the Netherlands have assumed a decline in sex differences of 
almost two years. All other national forecasters predict hardly any change of 
male excess mortality. 

Altogether, this results in a future in which the international differences in
life expectancy are supposed to grow, which would imply a reversal of a
long-standing recorded trend (Figure 10.4).

Compared with the forecasts made around 1985, the latest mortality forecasts 
are generally much more optimistic. Ten countries now foresee life expectancy 
for baby boys at least one year greater than previously expected for the year 
2000 (Figure 10.5). Ireland and Sweden predict an additional gain of more
than two years. Finland and Luxembourg predict the lowest gain of less than 
0.5 years. 

For baby girls too, increased life expectancies are expected in the short run,
although the differences here are less uniform. The forecasts for Ireland and
Spain have gone up considerably, whereas those made in Denmark, Germany
and the Netherlands have decreased a little.

Compared with the assumptions applied in the 1996-based population
projections of the United Nations (UN, 1998), the situation is mixed. For both
males and females, about two thirds of the national projection makers appear to
be more optimistic than the UN forecasters (Figure 10.6a-b). In particular, the
French national forecast for females deviates substantially. Within the group of
national forecasters who are more pessimistic than the UN, Denmark (women)
and Finland (men) are the extreme cases.

Figure 10.3a. Increase in life expectancy - males (years)

Figure 10.3b. Increase in life expectancy - females (years)

Figure 10.4. Variance in life expectancy between the EU countries (years)

Finally, compared with the 1995-based baseline scenario compiled by Eurostat
(Eurostat, 1997), the latest national life expectancy assumptions are generally 
less optimistic. Again the forecasts made in France (females only) and 
Denmark show the largest deviation from the baseline scenario level, together 
with Austria (more optimism) and Finland (more pessimism) (Figure 10.7a-b). 



10.4 A Lack of Uniformity in Mortality Assumption by Age

The differing views on future mortality trends are also visible on the charts 
showing the expected change of mortality rates by age. For example, both 
Ireland and the United Kingdom forecast that the female infant mortality rate
will decline by approximately 60 per cent for the period 1995-2020, 
whereas Germany and the Netherlands foresee a decrease of at most 10 per 
cent. Most other EU countries including Denmark expect a reduction of 
around 30 per cent (Figure 10.8).

Figure 10.5a. Male life expectancy for 2000 - differences between the latest national
forecasts and those made around 1985 (years)

Figure 10.5b. Female life expectancy for 2000 - differences between the latest national
forecasts and those made around 1985 (years)

Figure 10.6a. Male life expectancy - differences between national forecasts and UN
projections, 1995-2020 (years)

Figure 10.6b. Female life expectancy - differences between national forecasts and UN
projections, 1995-2020 (years)

Figure 10.7a. Male life expectancy - differences between national forecasts and
Eurostat's baseline scenario, 1995-2020 (years)

Figure 10.7b. Female life expectancy - differences between national forecasts and
Eurostat's baseline scenario, 1995-2020 (years)

To take another example, the mortality rate among women aged 80-84
demonstrates that countries think fairly differently about future reductions in
old age mortality. Here, France and Ireland again are the most optimistic
Member States, assuming declines of almost 40 per cent. Denmark, however,
expects no change at all and the Netherlands foresees relatively small
decreases in mortality among the oldest old. The average reduction over the
countries amounts to about 20 per cent (Figure 10.9).
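
The percentage declines quoted in this section (for instance, roughly 60, 30 and 10 per cent for infant mortality over 1995-2020) can be easier to compare when expressed as annual rates. The following Python sketch does this under the assumption, made here only for illustration, that the decline is spread evenly as a constant proportional reduction per year; the function name `annual_decline` is hypothetical and the national forecasts themselves need not follow such a path.

```python
def annual_decline(total_decline: float, years: int) -> float:
    """Constant annual proportional decline that compounds to `total_decline`
    over `years` (declines expressed as fractions, e.g. 0.60 for 60 per cent)."""
    if not 0.0 <= total_decline < 1.0:
        raise ValueError("total_decline must lie in [0, 1)")
    if years <= 0:
        raise ValueError("years must be positive")
    return 1.0 - (1.0 - total_decline) ** (1.0 / years)


YEARS = 2020 - 1995  # projection interval discussed in the text

# Total declines of roughly the sizes quoted in the text.
for total in (0.60, 0.30, 0.10):
    rate = annual_decline(total, YEARS)
    print(f"{total:.0%} over {YEARS} years ~ {rate:.2%} per year")
```

Under this assumption, a sixfold difference in the total decline (60 against 10 per cent) corresponds to an even larger ratio between the implied annual rates, because the reductions compound.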

Figures 10.10a-b illustrate that on average the greatest declines are forecasted
for children aged about 10 and for elderly persons around 65 years of age.
However, very little change in mortality is expected for both men and women
aged approximately 30 years as well as for those aged 90 and over.

In order to show some contrasts, these graphs also include the future
patterns of mortality change for the countries with the lowest and highest
expected gains in life expectancy: Denmark and France. It is clear that
Danish forecasters have assumed that survival rates among those aged 60
years and over will only slightly improve (men) or not at all (women). Even
among those aged around 40, very little progress is expected. Thus the
(moderate) future gain in life expectancy for Denmark is almost completely
due to a continuous decline of already fairly low mortality rates among
children and young adults. France, however, has forecasted a further
rectangularisation of the actual age-specific survival curve: children, older
adults and elderly persons will experience further, significant declines in
mortality rates. Only French men aged around 25 are not expected to profit
from the general improvement in survival conditions.

Figure 10.8. Infant mortality rate.

Figure 10.9. Death rates at age 80-84.

From similar graphs drawn for all the other EU countries, it is clear that EU 
population forecasters have differing thoughts about the future paths of 
mortality progress, although the vast majority of countries seem to work 
with the general assumption that the reduction in mortality among elderly
persons diminishes with age. For other population age groups, in particular
for children and those aged around 30 or 40 years, fairly heterogeneous
reduction levels are applied. Some countries such as Austria and Ireland
even expect an increase in mortality levels for some (young) age groups.
Hence, there is a relatively large variation of views on future age-specific
changes in mortality for children and a relatively small variation for the
oldest old amongst the EU forecasters (Figure 10.10c; see also the Annex
for the age-specific mortality rate changes 1995-2020 by country).



10.5 A Moderate Use of Variants in the National Forecasts

One of the certainties in population forecasting is that everybody will die.
Thus, as opposed to fertility and migration, where both the intensity and the
timing of future behaviour have to be predicted, mortality assumptions are
solely concerned with the (average) age at which the event of death occurs in a
population. Short-term forecast errors in age-specific death rates can be
counterbalanced in the medium term.

However, sensitivity analyses of recent projections and empirical evidence
from historical projections already indicate that after ten years the cumulative
impact of structural errors in mortality assumptions might lead to relatively
large forecasting errors, particularly for the oldest old. With the expected
acceleration of population ageing in the next decades due to the ageing of
the post-war baby boom generations and no general consensus among
demographers about the future course of mortality, one would therefore expect
a wide use of mortality variants. In practice, however, only six EU countries
have supplemented their principal mortality projection by a maximum of two
additional variants of prediction: Austria, Belgium, Italy, the Netherlands,
Sweden and the United Kingdom.
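
How a small structural error can accumulate over a decade may be illustrated with simple arithmetic. The Python sketch below assumes, purely for illustration, that death rates fall by a constant proportion each year and that the assumed annual reduction differs from the one that materialises; the function name and the numerical values are hypothetical and are not taken from any national forecast.

```python
def relative_error(assumed_reduction: float, true_reduction: float, years: int) -> float:
    """Relative error of the projected death rate after `years`, when a constant
    annual proportional reduction is assumed but a different one materialises."""
    projected = (1.0 - assumed_reduction) ** years
    realised = (1.0 - true_reduction) ** years
    return projected / realised - 1.0


# Hypothetical case: 1 per cent annual reduction assumed, 2 per cent realised.
for years in (1, 5, 10, 25):
    err = relative_error(assumed_reduction=0.010, true_reduction=0.020, years=years)
    print(f"after {years:2d} years the projected rate is {err:+.1%} off")
```

The error grows roughly in proportion to the horizon at first and faster thereafter, which is one reason why variants become more informative as the projection period lengthens.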

Apart from Austria, which produced just one higher life expectancy variant,
the other five countries listed above prepared a lower and a higher life expectancy
variant situated around the 'best guess'. The margins between the low and
high variant in the year 2020 range from almost two (UK) to more than three 
years (NL). The Austrian high survival variant is a real outlier: for both men 
and women it shows a difference of over two years in 2020 compared with the 
central projection. 



10.6 Justifications are Simple and Straightforward

All EU countries have documented their latest national population forecasts 
in some official publication (see the list of references). These written 
materials provide a global impression of the considerations that were used in 
setting future mortality levels. Most publications are incomplete, in the
sense that the reasons behind the mortality assumptions are frequently
neither explicitly mentioned nor fully discussed. In addition, in some cases
there is no summary in English, French or German. Hence some forecasters
had to be interviewed in order to supplement the written information.

In general it can be stated that all national forecasters rely a great deal on
the idea that the future is most likely to resemble the (recent) past. Nobody
'dares' to foresee major trend shifts, such as a rapid decrease of male excess
mortality due to recent changes in female lifestyles or increases of mortality
among the oldest old due to cohort effects. Even countries with
internationally fairly low life expectancies or relatively large differences 
between male and female death rates do not expect that much will change in 
the next two to three decades. 

In addition to the notion of continuity, all forecasters (still) seem to use the
basic assumption that improvements in survival rates will become
increasingly difficult as life expectancy progresses. Although recent
research in human biology and past experience have proven that the limits
of longevity have most probably not been reached yet, all official European
demographic projection makers assume a slowdown in current trends.

Another related feature is that most EU countries seem to have prepared
their mortality forecasts without carrying out extensive time-series analyses
or international comparisons, or examining published research. However,
France, Ireland, Italy, the Netherlands and the United Kingdom do carry out
extensive reviews. Other EU countries may lack human resources to do this,
or simply do not believe in the additional value of such efforts. The use of
experts' opinions, however, has become common practice, but only a few
countries explicitly expound their views.

Most countries describe the most significant cause-of-death trends over
recent years, but Italy is the only EU country that has made quantitative
mortality forecasts by cause of death. Other countries opted for
a limited set of qualitative statements about developments in mortality from
major causes of death underpinning long-term life expectancy levels, or
about future age patterns in the reductions in death rates.

Explicit assumptions about important determinants influencing specific
cause-of-death trends, such as tobacco smoking and medical technology, are barely
in evidence. In addition, there is barely a mention of changes in, for example,
morbidity patterns and preventative health care systems, or of the possible
impact of population ageing and cohort effects.

Finally, some countries justify their decisions by examining errors in
previous mortality forecasts. Thus, if short-term errors are small, one counts
on the long-term quality of an earlier forecast, and consequently one is
inclined to leave things unchanged. Or, if gains in life expectancy are
constantly underestimated, one ultimately decides to switch to (sometimes
considerably) higher levels of life expectancy. In this respect, it is perhaps
better to evaluate series of national mortality forecasts instead of merely
reviewing the latest predictions.



10.7 Various Forecasting Methodologies in Use

Broadly speaking, two different strategies can be used to forecast age-specific
mortality rates. The first is to extrapolate death rates, life expectancy
at birth or other summary indicators. The second is to set a medium
or long-term 'target level' for the model parameters used, and then to apply
an interpolation procedure between the latest observations and targets to
generate annual forecasts.
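
The difference between the two strategies can be sketched for a single age-specific death rate. The Python example below is only an illustration under stated assumptions: extrapolation is represented by a constant annual proportional reduction, and the interpolation procedure is taken to be log-linear, which is a choice made here and not a description of any country's actual method. The function names and all numerical inputs are hypothetical.

```python
import math


def extrapolate(rate0: float, annual_reduction: float, years: int) -> list[float]:
    """Strategy 1: carry a constant annual proportional reduction forward."""
    return [rate0 * (1.0 - annual_reduction) ** t for t in range(years + 1)]


def interpolate_to_target(rate0: float, target_rate: float, years: int) -> list[float]:
    """Strategy 2: log-linear interpolation between the last observed rate
    and a target rate set for the end of the period."""
    step = (math.log(target_rate) - math.log(rate0)) / years
    return [rate0 * math.exp(step * t) for t in range(years + 1)]


# Hypothetical inputs: a death rate of 50 per 1,000 in the base year,
# a 1 per cent annual reduction, and a target of 40 per 1,000 after 25 years.
path_extrapolated = extrapolate(0.050, 0.01, 25)
path_targeted = interpolate_to_target(0.050, 0.040, 25)

print(f"extrapolated rate after 25 years: {path_extrapolated[-1]:.4f}")
print(f"targeted rate after 25 years:     {path_targeted[-1]:.4f}")
```

With these particular choices the two strategies produce the same path whenever the target equals the extrapolated end value, so the practical difference lies in where the judgement enters: in the assumed pace of change or in the assumed end point.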

At least three of the 13 EU countries considered here used the first approach in
its purest form. Belgium, France and Luxembourg have extrapolated age-specific
death rates by using age-specific mathematical functions that best
described past changes. Some corrections were made to avoid implausible
future age patterns or unrealistically low levels. Italy also followed the first
approach, but extrapolated various parameters using an age-period-cohort
model.

Denmark, Finland, Ireland, Spain and the United Kingdom could be seen as 
countries which have used a combination of extrapolation and targeting 
methods. All these countries applied age patterns of reductions in death 
rates. Denmark and Finland assume a constant improvement of survival 
rates until 2010, which could therefore be seen as the target year. The other 
countries assumed that death rate reductions would slightly diminish and/or 
change during the projection period, so that increases in life expectancies 
would be partially postponed or recent increases in mortality rates for 
particular generations would disappear. 

The other four EU countries (Austria, Germany, the Netherlands and 
Sweden) use methodologies giving preponderance to targets. All forecasters 
in these countries began their quantitative assumption-making process by 
setting feasible gains in life expectancy and/or improvements in age-specific 
mortality patterns for (different parts of) the prediction period. These target 
values are derived from observed life tables of more 'advanced' populations 
(Germany), model life tables developed by Coale and Guo (Austria) 
combined with the most recently observed national life table (Sweden), or 
are merely judged by looking at the latest mortality trends, their 
determinants and the application of a number of statistical constraints (the 
Netherlands; see also Chapter 9). 

With regard to the forecasting of mortality by variables in addition to sex
and age, one can say that just two EU countries use a bottom-up,
multidimensional cohort-component projection model. Germany produces its
national population forecasts by summing up the results of subnational
population forecasts for nationals and non-nationals. Italy also adds up
regional population forecasts, but instead of distinguishing by nationality, it
checks the plausibility of regional mortality assumptions against the
outcomes of a national mortality forecast by major causes of death.



10.8 Conclusions

Apart from Greece and Portugal, all Member States of the European Union 
revised their official national population forecasts around 1995. All countries 
foresee further improvements in life expectancy at birth, but at a lower rate 
than that observed during the 1970s and 1980s. Denmark and Finland are the 
most 'pessimistic'. Both countries assume that the long-standing recorded 
trend of increasing life expectancy will stop around 2010. Austria, Belgium 
and France, on the contrary, expect life expectancy to increase more or less 
linearly. The other EU countries assume a slowdown in the rate of increase. 
As a result, international differences in life expectancy are expected to 
intensify over the next decades. 

Most official national mortality forecasts turn out to be less 'optimistic' than 
the mortality assumptions used in Eurostat's 1995-based baseline scenario. 
Compared with the 1996-based population projections compiled by the United 
Nations, it appears to be the opposite: the majority of the EU countries expect 
somewhat higher gains in life expectancy. 

Compared with the national mortality forecasts made around 1985, the
perspectives set around 1995 for life expectancies in 2000 are substantially 
higher. Most national forecasters have probably become more 'optimistic' 
after being confronted with significant projection errors during the period 
1985-1995. 

With respect to future age-specific mortality patterns, one can observe a high 
degree of diversity. Some countries foresee that the largest decrease will 
continue for infants and young children, whilst some other Member States 
expect strong decreases for those aged 50-80. Several countries assume no 
further or very little progress for those aged around 40 and the oldest old. 

Most EU countries apply relatively simple forecasting methods, the most
common being the use of constant or slightly decreasing age-specific reduction
factors, based mainly on recent trends. Some countries start the
assumption-making process by 'targeting' life expectancies at birth in the long run (mostly at
2050), followed by interpolation. Only Italy applies a fairly detailed mortality
projection model by main causes of death, generation and region.

Almost all countries consult experts. However, neither their opinions, nor the 
basic qualitative assumptions behind the model inputs and results of time-series 
analyses are noted in the official publications. A few countries report 
international comparisons. 

Finally, despite the growing uncertainties of future mortality trends and the 
increasing impact of mortality on population ageing, seven EU countries have 
compiled mortality forecasts without any uncertainty variants. And those
countries that did make variants applied widely differing margins. Hence, a
common European view on how to express uncertainty in mortality forecasting 
belongs to a distant future.

Age-Specific Mortality Rate Changes 1995-2020

Figure A2. Age-specific mortality rates - males - changes 1995-2020

11. Mortality Models Incorporating
Theoretical Concepts of Ageing



Anatoli YASHIN 

Abstract 

Among the basic attributes of ageing, heterogeneity, homeostasis and 
stochasticity were studied extensively in the second half of the 20th century. A
selection of mortality models developed on the basis of these attributes are 
reviewed in this chapter, including older as well as more recent concepts. The 
older models of vitality, fixed heterogeneity and debilitation, being unsuitable 
for practical use, proved to be appropriate for testing simple hypotheses about 
the importance of certain factors for survival. In contrast, the more recent 
models of changing frailty, and in particular the stochastic process models of 
mortality, allow an explicit description of the physiological mechanisms of
ageing. These models also appeared to be successful in complex applications. 
Irrespective of when they were devised, the models of evolutionary theories of 
ageing are less certain in their conclusions than any other models mentioned 
above. However, all models discussed here have contributed to improvements 
in our understanding of age patterns of mortality. Some of them (i.e. changing 
frailty and stochastic process models) can be safely recommended as a tool for 
both the justification and the prediction of mortality changes in the future. 



11.1 Introduction

Many theories and concepts of physiological and biological ageing would be
more complete if they were consistent with demographic mortality data.
Gompertz's (1825) attempt to relate the exponential decline in vitality (a
hypothetical physiological index) to the exponential increase in the mortality
rate stimulated efforts to develop an appropriate theory linking the
age-specific mortality curves with the age trajectories of key physiological or
biological indices. Until the middle of the 20th century, theories of mortality
were primarily descriptive. The leading ones sought to relate Gompertz's
mortality curve to hypothetical ageing processes that were not accessible to
measurement. For this reason, the early models were not suitable for further
development. The situation changed in the second half of the 20th century,
when many researchers tried to include the basic physiological principles and
facts, such as heterogeneity, homeostasis and stochasticity, in modelling and
explaining the empirically observed age trajectories of mortality. Recent
population and experimental studies have brought new facts that challenged
existing theories connecting the age patterns in survival chances with
biological ageing. Now, it is not just the Gompertz curve which has to be
explained. For a variety of species the mortality rate at older ages does not
follow the Gompertz law; the rate decelerates and for some species it then
levels off or even declines at the oldest ages (Vaupel et al., 1998). What forces
shape this observed mortality pattern? How can one distinguish between
primary causes of ageing and the results of homeostatic adaptation to these
primary causes? These questions still remain open for investigation and
discussion.

In this chapter, we review several approaches to modelling the age pattern of 
mortality that incorporate theoretical concepts of ageing. The selection of
mortality models reviewed in this chapter includes the older as well as the
recent constructs. The older theories and models are presented to show how
the more recent ones have evolved. In this way, we hope to contribute to a
better understanding of the basic regularities of ageing and survival and to
show how new facts, new theories and new types of observations influence the
structure and scope of applications of mortality models.



11.2 Physiological Ageing and Mortality

A major characteristic of the earlier theories and models of mortality is that
they associate the dramatic (exponential) increase in mortality at old ages with
an equally dramatic decline in the vitality of the organism. However, reviews
of physiological studies of ageing carried out by Shock (1960 and 1974) and
later by Bafitis and Sargent (1977) showed that various functions
characterising physiological capacities in humans decline more or less linearly
with age. How can a slow, uniform, linear decrease in vital capacity with age
be reconciled with a rapid, exponential increase in the rate of mortality?

The idea of relating the linear decline in physiological capacities during the 
ageing process with the exponential increase in the mortality rate stems from 
Strehler and Mildvan (1960), who viewed mortality as a response to the 
random fluctuations of energy demands. Their mathematical model 
combined the Gompertz curve with the assumption that the death rate at a 
given age is proportional to “the frequency of stresses which surpass the 
ability of a subsystem to restore the initial conditions” at that age. They 
showed that, under such an assumption, the exponential increase in 
mortality is related to the linear decline in vitality. 

The mortality model suggested by Strehler and Mildvan (1960) was based on 
the assumption that an organism is a system which has a certain maximum
ability to restore initial conditions after a challenge, that is, a change due to 
internal or external fluctuations. This ability, called vitality, declines linearly 
with age. Death occurs when the rate at which the organism works to restore 
the original state is below the level demanded to overcome the effects of a 
challenge. The second assumption was that the magnitude of the responses 
required to overcome a challenge has a Maxwell-Boltzmann distribution 
similar to that of energies between molecules. 

Strehler and Mildvan defined vitality, $V(x)$, as the capacity of an individual
organism to stay alive at age $x$. It is equivalent to the minimum degree of an
environmental challenge sufficient to cause death at age $x$. According to this
model, the mortality rate $\mu(x)$ is proportional to the frequency of
environmental challenges $\varphi(x)$ that are sufficient to cause the death of an
organism of age $x$:

$$\mu(x) = c\,\varphi(x)$$

where $c$ is the total number of challenges per unit of time. This frequency
depends on the vitality $V(x)$ of the organism:

$$\varphi(x) = k\,e^{-V(x)/(\varepsilon D)}$$

where $D$ measures the relative deleteriousness of the environment and $\varepsilon D$ is a
measure of the average demand for energy. Thus $\mu(x)$ can be formulated as:

$$\mu(x) = K\,e^{-V(x)/(\varepsilon D)}$$

where $K$ is a constant and equals $kc$. For the Gompertz curve, $\mu(x) = a\,e^{bx}$,
and

$$V(x) = \varepsilon D \ln\!\left(\frac{K}{a}\right)\left[1 - \frac{bx}{\ln(K/a)}\right] = V_0\,(1 - Bx)$$

with $V_0 = \varepsilon D \ln(K/a)$ and $B = b/\ln(K/a)$.

Thus, $V(x)$ declines linearly with age, and $B$ characterises the rate of
physiological ageing. The relationship $\ln a = \ln K - b/B$ between Gompertz's
parameters is called the 'compensation effect' or the 'Strehler and Mildvan
correlation'.
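
The link between a linear decline in vitality and an exponential rise in mortality can be checked numerically. The Python sketch below assumes the relations described above in the form mu(x) = K exp(-V(x)/(eps D)) for the mortality rate and mu(x) = a exp(b x) for the Gompertz curve; the parameter values and the function names are hypothetical and serve only to make the example concrete.

```python
import math


def vitality(x: float, a: float, b: float, K: float, eps_D: float) -> float:
    """Vitality implied by a Gompertz curve: V(x) = eps_D * (ln(K/a) - b*x)."""
    return eps_D * (math.log(K / a) - b * x)


def mortality_from_vitality(V: float, K: float, eps_D: float) -> float:
    """Mortality rate mu = K * exp(-V / eps_D)."""
    return K * math.exp(-V / eps_D)


# Hypothetical parameter values, chosen only to make the example concrete.
a, b = 1e-4, 0.09      # Gompertz parameters
K, eps_D = 1.0, 1.0    # challenge-rate constant and average energy demand

V0 = eps_D * math.log(K / a)   # vitality at age zero
B = b / math.log(K / a)        # rate of physiological ageing

for x in (0, 30, 60, 90):
    V = vitality(x, a, b, K, eps_D)
    mu = mortality_from_vitality(V, K, eps_D)
    print(f"age {x:2d}: V = {V:6.3f} (linear form {V0 * (1 - B * x):6.3f}), "
          f"mu = {mu:.5f} (Gompertz {a * math.exp(b * x):.5f})")
```

In this parameterisation the compensation relation ln a = ln K - b/B holds by construction, so populations sharing K and B but differing in b would trace a straight line in the (b, ln a) plane.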



Strehler and Mildvan's (1960) theory was modified by Atlan (1968), who 
suggested expressing vitality in terms of entropy rather than of energy. This 
led to the use of information theory to predict vitality patterns in ageing.
Atlan (1968) suggested that vitality, V, be defined as the “entropy of mortality 
activation”, i.e., the increase in entropy needed to kill the organism. 
According to Atlan, ageing is a process of progressively increasing entropy, 
meant as a decrease in the information content of the organism, until a level is 
reached that is incompatible with life. The size of information decrease 
required to kill the organism declines with age. Atlan found that this decline is 
non-linear. The initial decline may be approximated by a straight line up to a 
certain age, followed by a rapid change in the slope that afterwards reaches a 
constant value. Atlan claimed that the linear portion of the curve corresponds 
to the exponential variation in the mortality rate (Gompertz curve) and that the 
subsequent portions correspond to the last part of the mortality curve, which
deviates from Gompertz for old ages and moves towards a constant mortality
rate.

Strehler and Mildvan (1960) understood the importance of physiological
homeostasis in ageing, the process whereby an organism tends to support
itself in a state of equilibrium and to return to this state if moved from it.
However, they did not include these physiological homeostatic processes in
their model, nor did they take the stochastic behaviour of physiological
indices into account. This was done by Sacher and Trucco (1962), who
suggested a stochastic mechanism of ageing and mortality, in which the
individual chances of death are regulated by random fluctuations of internal
and external origin.



11.3 Homeostasis and Stochasticity

Simms (1948) viewed the stochastic nature of mortality as resulting from
biological forces acting randomly within individuals who then become
different as they grow older. The effects of stochasticity in ageing were also
observed in the genetic studies of ageing and mortality. In inbred
populations exposed to a controlled environment, the life spans of genetically
identical individuals were not identical; instead, they showed a considerable
amount of variance (Economos, 1982). This gave rise to the question about the
sources of this variance.

The need to include stochasticity among the mechanisms of the ageing
process is related to the fact that many of these mechanisms remain unknown.
Cutler (1975), who claimed that stochasticity is inherent in the ageing process,
explained its presence by the spontaneous decay of large polymer molecules
constituting the biological substance of the organism. Stochasticity may thus be
considered to be responsible for the observation that genetically identical
individuals exposed to the same environment have different life spans
(Curtsinger et al., 1992). Formally, Sacher and Trucco (1962) suggested a
model of mortality that includes both homeostatic forces and random
fluctuations of physiological variables. They also assumed a linear decline in
the homeostatic capacity with age.

Sacher and Trucco described the process of individual ageing through a
hypothetical physiological homeostatic process $Y_t$ satisfying the stochastic
differential equation:

$$dY_t = -aY_t\,dt + b\,dW_t, \qquad Y_0 = Y$$ (11.1)
Here, the parameter $a$, $a > 0$, characterises the homeostatic feedback
mechanism, the Wiener process $W_t$ represents stochastic fluctuations, the
diffusion coefficient $b$ describes their intensity, and $Y$ is a random variable
with the probability density function (p.d.f.) $P_0(y)$. The probability density
function $P(y,t)$ of $Y_t$ satisfies the Kolmogorov-Fokker-Planck equation:

$$\frac{\partial P(y,t)}{\partial t} = \frac{b^2}{2}\,\frac{\partial^2 P(y,t)}{\partial y^2} + a\,\frac{\partial}{\partial y}\bigl(yP(y,t)\bigr), \qquad P(y,0) = P_0(y)$$ (11.2)



Equations (11.1) and (11.2) give two equivalent descriptions of the Markov
diffusion stochastic process $Y_t$ (Øksendal, 1991). The homeostatic forces work
when $Y_t$ belongs to the interval $[y_1, y_2]$, where $y_1$ and $y_2$ represent lethal
boundaries. To solve this equation, one needs to specify $P_0(y)$ (i.e. the p.d.f.
of $Y_t$ at time zero) as well as the conditions on the lethal boundaries $y_1$ and
$y_2$ (i.e. the solution must be obtained in the interval $y_1 < y < y_2$).






(11.3) 



The solution of Equation (11.2) can be used to calculate the probability of
staying alive at time $t$ (i.e. of remaining within the lethal boundaries), that
is, the survival function. This function, in turn, can be used to derive the
hazard rate. Sacher and Trucco (1962) obtained this function when
$y_1 = -\infty$ and $y_2 = V$, $V > 0$, and found the stationary value of the hazard rate
$\mu$:

$$\mu = \frac{1}{\sqrt{\pi}}\,\frac{aV}{\sigma}\,e^{-V^2/\sigma^2} \qquad (11.4)$$

where $\sigma = b/\sqrt{a}$. Then, they assumed that the threshold $V$ declines linearly
with age $x$:

$$V(x) = V_0 - kx \qquad (11.5)$$

and that

$$\Delta V = kx \ll V_0 \qquad (11.6)$$

Strictly speaking, Assumption (11.5) made by Sacher and Trucco (1962) is
mathematically incorrect since age $x$ and time $t$ are dependent variables. The
result, however, can be obtained by considering two time scales, one for
“fast” time, $t$, and the other for “slow” time, $x$. (Because of the limited space
here, we have to omit the details of these calculations.) In this case, the
mortality rate can be approximated by:

$$\mu(x) \approx \alpha e^{\beta x} \qquad (11.7)$$

where, with $\sigma = b/\sqrt{a}$, $\alpha = \frac{aV_0}{\sqrt{\pi}\,\sigma}\,e^{-V_0^2/\sigma^2}$ and $\beta = \frac{2kV_0}{\sigma^2}$. Thus, this
model, like the Strehler and Mildvan (1960) model, implies an exponential
increase of the mortality rate from the linear decline of some physiological
indicator. Economos (1981, 1982) critically discussed and further developed the
ideas of Sacher and Trucco (1962) and Strehler and Mildvan (1960).
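
The step from a linearly declining threshold to an exponential mortality curve
can be checked numerically. The sketch below is only an illustration: it assumes
that the stationary hazard has the form mu = (1/sqrt(pi)) (aV/sigma) exp(-V^2/sigma^2)
with sigma = b/sqrt(a), and that the Gompertz parameters are
alpha = mu(V_0) and beta = 2kV_0/sigma^2. All numerical values and function names
are hypothetical.

```python
import math


def stationary_hazard(V, a, b):
    """Stationary hazard for a lethal boundary at V (assumed form, see text above)."""
    sigma = b / math.sqrt(a)
    return (a * V / (math.sqrt(math.pi) * sigma)) * math.exp(-(V / sigma) ** 2)


def gompertz_parameters(V0, k, a, b):
    """Gompertz level alpha and slope beta implied by a threshold V(x) = V0 - k*x."""
    sigma = b / math.sqrt(a)
    alpha = stationary_hazard(V0, a, b)
    beta = 2.0 * k * V0 / sigma ** 2
    return alpha, beta


# Hypothetical parameter values, chosen only so that k*x stays small against V0.
a, b, V0, k = 1.0, 1.0, 4.0, 0.005
alpha, beta = gompertz_parameters(V0, k, a, b)

for x in (0, 20, 40, 60, 80):
    exact = stationary_hazard(V0 - k * x, a, b)   # hazard with the lowered threshold
    approx = alpha * math.exp(beta * x)           # exponential (Gompertz) approximation
    print(f"x={x:3d}  exact={exact:.3e}  gompertz={approx:.3e}  ratio={exact / approx:.3f}")
```

The ratio stays close to one while kx is small relative to V_0 and drifts away
from one as the threshold decline becomes large, which is the regime in which the
approximation is no longer expected to hold.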

In spite of the difference between Sacher and Trucco's (1962) and Strehler and
Mildvan's (1960) statistical treatment of mortality, the two theories are quite
similar. In both, all individuals are assumed to be identical. According to
Sacher and Trucco, death results from the excess of the “displacement of
physiological functions”, caused by fluctuations, above the homeostatic
capacity of the organism. According to Strehler and Mildvan, death occurs when
the homeostatic power-generation capacity of the organism declines below a
critical level. Both theories are unrealistic in that the
assumption of a similar homeostatic capacity for individuals of the same age
contradicts experimental data (e.g. Simms, 1942 and 1948). That is why the
model of hidden heterogeneity in survival (frailty) suggested by Vaupel et al.
(1979) was an important step towards a better understanding of the observed
age patterns of mortality. In particular, the idea of heterogeneity was used to
explain the deceleration and levelling off of mortality rates at late ages (Vaupel
and Yashin, 1985).



11.4 I Heterogeneity in Mortality

Actuaries have been concerned with individual differences in mortality for 
more than a century. Higham (1851) used the idea of heterogeneity to 
explain the duration dependence of hazard rates. Biologists discovered 
heterogeneity in mortality in experimental survival studies. Simms (1942) 
found that some important physiological characteristics connected with an 
individual's 'vitality' are different for individuals of the same age. This 
finding generated a new way of thinking about mortality as it is estimated in 
population studies and about biological and physiological characteristics of 
ageing. It is now understood that the observed force of mortality does not 
necessarily reflect the biological regularities of ageing. At each age, this 
force manifests the average of the individual chances of death among 
survivors to this age and, hence, depends on the composition of a 
population. Heterogeneity contributes to the observed mortality pattern and 
reflects the combined stochastic influence of genetic and environmental 
factors on the process of individual ageing. The idea of genetic 
heterogeneity is indirectly addressed in the genetic theory of mortality and 
ageing developed by Szilard (1959). According to his model, “the main
reason why some adults live shorter lives and others live longer is the 
difference in the number of faults they have inherited”. Beard (1959) 
proposed a model of heterogeneous mortality which results in the logistic 
curve of the observed hazard. He claimed that the distribution of 
heterogeneity can be looked upon as an index of the genetic make-up of the 
population. He then assumed that the distribution of individuals by cause of 
death reflects this heterogeneity. Strehler and Mildvan (1960) admitted the
presence of heterogeneity in the population and used it to explain the 
deviation in the force of mortality from the Gompertz curve at old ages. 
They did not, however, include heterogeneity factors in their mortality 
model. Vaupel et al. (1979) conceptualised the idea of population 
heterogeneity in mortality and introduced the notion of individual frailty as a 
measure of individual differences in the chances of survival in the 
proportional hazards model. The properties and paradoxes of heterogeneity 
were discussed in Vaupel and Yashin (1985). The concept of heterogeneity
has also proved useful in several other areas, such as fertility (Sheps and
Menken, 1973), sociology (Tuma and Hannan, 1984) and survival analysis
(Andersen et al., 1992).

The models of fixed heterogeneity in mortality may be described in terms of a
joint probability distribution of two random variables, $T$ and $Z$, where $T$ is the
survival time and $Z$ is the heterogeneity variable. If $\mu(x; Z)$ is the conditional
hazard rate, then the marginal hazard $\bar{\mu}(x)$ is the following:

$$\bar{\mu}(x) = E\bigl(\mu(x; Z) \mid T > x\bigr)$$

When $\mu(x; Z) = Z\mu_0(x)$, the respective survival model is called the 'frailty
model' (Vaupel et al., 1979). Manton et al. (1986) showed that the frailty
model fits the historical Swedish cohort mortality data better than the
Gompertz-Makeham model without heterogeneity. Yashin et al. (1994)
confirmed these results, fitting the gamma-Makeham model to more recent
Swedish mortality data. Despite the good fit, the fixed frailty models may look
unrealistic since individual differences in the chances of survival may change
with age (Failla, 1958). Therefore, a changing frailty model was desired. As a
first step towards this model, the impact of unobserved, randomly changing
factors on mortality was taken into account in the stochastic process models of
ageing and mortality (Woodbury and Manton, 1977 and Yashin, 1985). These
models seem to be important in the analysis of longitudinal data on ageing and
survival, where changes in heterogeneity factors can be partly observed.
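
The effect of fixed frailty on the observed hazard can be illustrated with a
small sketch. It assumes, purely for illustration, a gamma-distributed frailty
Z with mean 1 and variance s2 and a Gompertz baseline mu_0(x) = A exp(Bx); under
these assumptions the marginal hazard has the well-known closed form
mu_0(x) / (1 + s2 * H_0(x)), where H_0 is the cumulative baseline hazard. The
parameter values and function names are hypothetical.

```python
import math


def gompertz_hazard(x, A, B):
    """Baseline (individual) hazard mu_0(x) = A * exp(B * x)."""
    return A * math.exp(B * x)


def gompertz_cumulative_hazard(x, A, B):
    """Cumulative baseline hazard H_0(x), the integral of mu_0 from 0 to x."""
    return (A / B) * (math.exp(B * x) - 1.0)


def marginal_hazard_gamma_frailty(x, A, B, s2):
    """Observed hazard among survivors when frailty is gamma with mean 1, variance s2."""
    return gompertz_hazard(x, A, B) / (1.0 + s2 * gompertz_cumulative_hazard(x, A, B))


# Hypothetical parameter values.
A, B, s2 = 1e-4, 0.1, 0.25

for x in range(60, 121, 10):
    base = gompertz_hazard(x, A, B)
    observed = marginal_hazard_gamma_frailty(x, A, B, s2)
    print(f"age={x:3d}  baseline={base:.4f}  observed={observed:.4f}")

print("limiting observed hazard B / s2 =", B / s2)
```

Under these assumptions the observed hazard follows the baseline at younger ages
and levels off towards B / s2 at very old ages, because the frailest individuals
die first and the survivors become increasingly selected.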



11.5 I Changing Frailty Models 

Since the models of changing frailty are more realistic, one may expect them
to be more appropriate for applications than the fixed frailty models. Such an
expectation is, however, not necessarily correct: it turns out that a wide range
of the randomly changing frailty (i.e. acquired heterogeneity, or debilitation)
models produce exactly the same parametric description of the mortality rate
as the fixed frailty models (Yashin et al., 1994). This finding shows that
survival data alone are not sufficient to distinguish between the different
mechanisms generating the observed mortality patterns. To take advantage of
more complex models, more sophisticated data need to be used.

The changes in the heterogeneity variable are not necessarily debilitative.
Some of them (e.g. recovery from illness) increase the survival chances of
individuals. The following two models describe this more general situation.



11.6 I Finite-State Heterogeneity Process

Let us assume that the heterogeneity variable changes over time in accordance
with the trajectory of some finite-state, continuous-time Markov process with
transition intensities $\lambda_{ij}(t)$, $i,j = 1,\dots,N$, and initial probabilities $p_i$,
$i = 1,2,\dots,N$. Let $\mu_i(t)$ be the mortality rate in state $i$. Then the marginal
mortality rate is:

$$\bar{\mu}(t) = \sum_{j=1}^{N} \pi_j(t)\,\mu_j(t)$$

where $\pi_j(t)$ satisfy the following system of non-linear differential equations:

$$\frac{d\pi_j(t)}{dt} = \sum_{i=1}^{N} \pi_i(t)\,\lambda_{ij}(t) - \pi_j(t)\bigl(\mu_j(t) - \bar{\mu}(t)\bigr), \quad \pi_j(0) = p_j$$

These equations are a particular case of a more general result derived by
Yashin (1970). The equations for $\pi_j(t)$ contain some part of the traditional
Kolmogorov equation and a selection term (Yashin, 1985). A similar
description exists in the case of continuously changing frailty.
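
A minimal numerical sketch of the finite-state model follows. It rests on several
assumptions that are not taken from the text: the equations for the state
probabilities among survivors are written as a Kolmogorov forward term minus a
selection term, the diagonal intensities equal minus the sum of the off-diagonal
intensities in the same row, all rates are constant in time, and a plain Euler
scheme is accurate enough. The two-state example and all names are hypothetical.

```python
def euler_step(pi, lam, mu, dt):
    """Advance the state probabilities among survivors by one Euler step."""
    n = len(pi)
    mu_bar = sum(pi[j] * mu[j] for j in range(n))  # marginal mortality rate
    new_pi = []
    for j in range(n):
        # Kolmogorov (forward) part: net flow of survivors into state j.
        flow = sum(pi[i] * lam[i][j] for i in range(n))
        # Selection part: states with above-average mortality are depleted.
        selection = pi[j] * (mu[j] - mu_bar)
        new_pi.append(pi[j] + dt * (flow - selection))
    return new_pi


def marginal_mortality_path(p0, lam, mu, t_end, dt=0.01):
    """Return a list of (t, mu_bar, pi) tuples along the Euler solution."""
    pi = list(p0)
    path = []
    n_steps = int(round(t_end / dt))
    for step in range(n_steps + 1):
        mu_bar = sum(p * m for p, m in zip(pi, mu))
        path.append((step * dt, mu_bar, tuple(pi)))
        pi = euler_step(pi, lam, mu, dt)
    return path


# Hypothetical two-state example: state 0 = robust, state 1 = frail.
lam = [[-0.05, 0.05],   # robust -> frail at rate 0.05
       [0.01, -0.01]]   # frail -> robust (recovery) at rate 0.01
mu = [0.01, 0.10]       # state-specific mortality rates
p0 = [1.0, 0.0]         # everyone starts in the robust state

path = marginal_mortality_path(p0, lam, mu, t_end=50.0)
for t, mu_bar, pi in path[::1000]:
    print(f"t={t:5.1f}  mu_bar={mu_bar:.4f}  pi=({pi[0]:.3f}, {pi[1]:.3f})")
```

The recovery rate in the second row is what makes the change in heterogeneity
non-debilitative in this toy example; setting it to zero gives a pure
debilitation process.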



11.7 I Continuously Changing Heterogeneity Process 

Let us assume that the heterogeneity variable, $Y_t$, changes stochastically over
time. Let $\mu(t, Y_t)$ be the respective conditional hazard. Then, the marginal
hazard $\bar{\mu}(t)$ is given as:

$$\bar{\mu}(t) = E\bigl(\mu(t, Y_t) \mid T > t\bigr)$$

Let us assume that $Y_t$ is a Markov process and satisfies the following
diffusion-type stochastic differential equation:

$$dY_t = a(t, Y_t)\,dt + b(t, Y_t)\,dW_t, \quad Y_0 = Y$$

where $W_t$ is a Wiener process. Note that in the absence of dynamic changes in
the heterogeneity variable one deals with a fixed heterogeneity model. To
calculate $\bar{\mu}(t)$ one has to know the conditional probability density function
$f_t(y) = \frac{\partial}{\partial y} P(Y_t < y \mid T > t)$. This function satisfies the following non-linear
partial differential equation (Yashin et al., 1985):

$$\frac{\partial}{\partial t} f_t(y) = -\frac{\partial}{\partial y}\bigl(a(t,y) f_t(y)\bigr) + \frac{1}{2}\frac{\partial^2}{\partial y^2}\bigl(b^2(t,y) f_t(y)\bigr) + f_t(y)\bigl(\bar{\mu}(t) - \mu(t,y)\bigr)$$

where $\bar{\mu}(t) = E(\mu(t, Y_t) \mid T > t)$.

An equation of this kind cannot be solved analytically in general. However,
in one particular case, that of the conditional Gaussian model, it can be
solved.



11.8 I The Conditional Gaussian Model 

For many applications, the stochastic process models seem to be more natural
than the models of fixed frailty. However, their implementation requires the
specification of the conditional hazard $\mu(t, Y_t)$ and of the coefficients of
the equations for the conditional probability density function among survivors,
$f_t(y)$. The most important special case is the quadratic hazard model. The
use of the quadratic hazard model for the conditional hazard,
$\mu(t, Y_t) = \lambda(t)Y_t^2$, is justified by the U- (or J-)shaped relationship between
risk factors and hazards observed in epidemiological studies (Witterman et
al., 1994). This curve is also consistent with the hormesis principle, which
states that an appropriate exposure to stress stimulates the organism to
increase its fitness. The U-shape of the hazard function is consistent with the
attributes of homeostasis believed to restore optimal values of physiological
variables after a change occurs in the risk factors. With respect to the
U-shaped association, the quadratic hazard model with unobserved or partially
observed, stochastically changing covariates gives a unique opportunity to
combine data collected in epidemiological and demographic studies.

The assumptions about the quadratic form of the hazard and the Gaussian
distribution of the covariate process allow us to find an analytical solution to
the equation for the conditional probability density function $f_t(y)$ (Woodbury
and Manton, 1977 and Yashin, 1980, 1985). The marginal mortality rate in
this case is the following:

$$\bar{\mu}(t) = \lambda(t)\bigl(m_t^2 + \gamma_t\bigr)$$

where $m_t$ and $\gamma_t$ satisfy the following non-linear differential equations:

$$\frac{dm_t}{dt} = a_0(t) - a(t)m_t - 2\lambda(t)m_t\gamma_t, \quad m_0 = m$$

and

$$\frac{d\gamma_t}{dt} = -2a(t)\gamma_t + b^2(t) - 2\lambda(t)\gamma_t^2, \quad \gamma_0 = \gamma$$

The extensions and applications of this model are discussed in Yashin and 
Manton (1997). An important advantage of this model is that partial 
observations of the process for each individual may be combined with 
information about the survival of that individual. This property makes this 
model appropriate for the analysis of data from longitudinal studies of ageing 
and survival, longitudinal data with informative censoring, etc. The model of 
mortality in which the stochastic covariate process characterises both the 
discrete and continuously changing heterogeneity variables is discussed in 
Yashin et al. (1995).
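
The moment equations of the conditional Gaussian model are easy to integrate
numerically. The sketch below assumes that they take the form
dm/dt = a0 - a*m - 2*lam*m*gamma and dgamma/dt = -2*a*gamma + b^2 - 2*lam*gamma^2,
that the marginal hazard is lam*(m^2 + gamma), and, as a simplification not made
in the text, that a0, a, b and lam are constant. The Euler scheme, parameter
values and function names are hypothetical.

```python
def integrate_moments(m0, gamma0, a0, a, b, lam, t_end, dt=0.001):
    """Euler integration of the conditional mean m and variance gamma among survivors."""
    m, gamma = m0, gamma0
    for _ in range(int(round(t_end / dt))):
        dm = a0 - a * m - 2.0 * lam * m * gamma
        dgamma = -2.0 * a * gamma + b * b - 2.0 * lam * gamma * gamma
        m += dt * dm
        gamma += dt * dgamma
    return m, gamma


def marginal_hazard(m, gamma, lam):
    """Marginal mortality rate lam * (m^2 + gamma) for the quadratic hazard lam * Y^2."""
    return lam * (m * m + gamma)


# Hypothetical mean-reverting covariate: drift a0 - a*Y, diffusion b.
for t in (0.0, 10.0, 20.0, 40.0):
    m_t, g_t = integrate_moments(m0=1.0, gamma0=0.5, a0=0.5, a=0.1, b=0.3,
                                 lam=0.02, t_end=t)
    print(f"t={t:4.1f}  m={m_t:.3f}  gamma={g_t:.3f}  "
          f"hazard={marginal_hazard(m_t, g_t, 0.02):.4f}")

# Fixed heterogeneity as a special case: with a0 = a = b = 0 and m0 = 0 the
# variance equation has the closed-form solution gamma0 / (1 + 2*lam*gamma0*t).
m_T, g_T = integrate_moments(0.0, 1.0, a0=0.0, a=0.0, b=0.0, lam=0.1, t_end=10.0)
print("Euler:", g_T, " closed form:", 1.0 / (1.0 + 2.0 * 0.1 * 1.0 * 10.0))
```

The last two lines illustrate the remark that fixed heterogeneity is a special
case: without drift and diffusion, selection alone shrinks the variance of the
covariate among survivors.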



11.9 I Repair Capacity and Mortality 

Damage caused to DNA by environmental factors can be healed by enzymatic 
systems termed 'DNA repair' (Bohr and Anson, 1995). As early as 1959, 
Szilard suggested that changes in somatic DNA occurring as a function of time 
cause senescence. This would imply that animal species with the most efficient
DNA repair systems throughout their life span exhibit the greatest longevity
(Sacher, 1982).

In this section we investigate possible effects of DNA repair on mortality. The
theoretical framework we use is the theory of reliability and repair, which is
widely known among technical and other quantitative scientists, though less
popular among social scientists. We assume that, in the absence of proper
repair, spontaneous DNA mutations lead to defects causing rapid death of the
organism. The proper functioning of the repair mechanism guarantees the
so-called 'minimal repair'. This term means that when damage occurs, the system
is immediately repaired and returns to the state it was in prior to the
failure. Vaupel and Yashin (1987) investigated this kind of repair in
demographic applications, where the procedure of minimal repair was interpreted
as an immediate 'resuscitation' of an individual after his/her death. We shall
see below that, despite the theoretical nature of resuscitation, the construct
may be useful in drawing conclusions about mortality change under different
repair capacities of the populations.

Let $\mu(x)$ be the mortality rate in a given population. Suppose that each
individual in this population becomes immediately 'resuscitated' (repaired)
after his/her first death, and that after resuscitation the person experiences the
same mortality $\mu(x)$ as before (Vaupel and Yashin, 1987).

The calculations show that, after one such repair, the resulting mortality rate
$\mu^*(x)$ is:

$$\mu^*(x) = \mu(x)\,\frac{\Lambda(x)}{1 + \Lambda(x)}$$

where $\Lambda(x) = \int_0^x \mu(u)\,du$. The rate $\mu^*(x)$ is lower and increases faster than $\mu(x)$.

Straightforward calculations produce $\mu^*_n(x)$, the mortality rate in the case of $n$
repairs.
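
The case of n repairs can be sketched as follows. The general formula used here
is not quoted from the text; it is derived under the assumption that, with
minimal repair, the number of 'deaths' experienced by age x is Poisson
distributed with mean Lambda(x), so that survival with n repairs is the
probability of at most n events. For n = 1 it reduces to
mu(x) * Lambda(x) / (1 + Lambda(x)). The Gompertz baseline and parameter values
are hypothetical.

```python
import math


def hazard_with_repairs(mu_x, Lambda_x, n):
    """Mortality rate at an age with hazard mu_x and cumulative hazard Lambda_x,
    when each individual can be resuscitated up to n times (assumed derivation)."""
    terms = [Lambda_x ** k / math.factorial(k) for k in range(n + 1)]
    return mu_x * terms[-1] / sum(terms)


def gompertz(x, A=1e-4, B=0.1):
    """Hypothetical Gompertz baseline: returns (mu(x), Lambda(x))."""
    mu_x = A * math.exp(B * x)
    Lambda_x = (A / B) * (math.exp(B * x) - 1.0)
    return mu_x, Lambda_x


for x in (40, 60, 80, 100):
    mu_x, Lambda_x = gompertz(x)
    rates = [hazard_with_repairs(mu_x, Lambda_x, n) for n in (0, 1, 2)]
    print(f"age={x:3d}  n=0: {rates[0]:.5f}  n=1: {rates[1]:.5f}  n=2: {rates[2]:.5f}")
```

With n = 0 the function returns the original mortality rate; each additional
repair lowers the rate at a given age, most strongly at ages where the
cumulative hazard is still small.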

In populations with different standards of living, individuals may have 
different repair capacities of their organisms. The improvement of the 
standards of living in the populations increases the average level of individual 
repair capacities. This explains the observed mortality reduction and the 
negative correlation between the slope and the intercept of the logarithm of the 
mortality curves discussed by Strehler and Mildvan (1960) and Gavrilov and 
Gavrilova (1991). Note that none of the models discussed so far takes
evolutionary considerations into account.



11.10 I Mortality and Evolution

Evolutionary theories of ageing assume the presence of a permanent process of
random DNA mutations acting through generations and they involve the
notions of natural selection and fitness (fitness is the reproductive success of a
species; Medawar, 1952; Hamilton, 1966 and Charlesworth, 1990). Two main
theories are widely discussed: mutation accumulation and antagonistic
pleiotropy. According to the theory of mutation accumulation, natural
selection is less effective at reducing the frequency of later-acting mutations in
the populations. This is one compelling reason why ageing is expected to
evolve. According to the theory known as antagonistic pleiotropy, mutations
that increase fitness at earlier ages do so at the expense of decreasing fitness
at later ages (thereby also increasing the death rate). As natural selection is
believed to act more strongly on the earlier, beneficial effects, this property
can be easily incorporated into the population. Future prospects for survival
and reproduction are therefore expected to decline later in life.

Charlesworth (1990) suggested a quantitative genetic model of senescence,
which includes the effects of mutation pressure. The model predicts that the
additive genetic variance for mortality rates should increase with age, and that,
if deleterious mutations are completely age-specific in their effects, then a total
collapse of survival would occur at later ages. This means that the mortality rate
tends to infinity within a finite age interval, which is again in contradiction with
the results of the studies discussed by Vaupel et al. (1998). These studies show
that the mortality rates of large cohorts of the nematode C. elegans (Brooks et
al., 1994 and Johnson, 1997), and of the fruit fly Drosophila melanogaster
(Curtsinger et al., 1992), as well as of humans (Kannisto, 1996) decelerate at
very old ages. In experiments involving various organisms, the mortality rates
were found to plateau or even decline at old age (Carey et al., 1992).

The hypothetical age trajectories of the mortality rate can be calculated from 
the models based on the antagonistic pleiotropy theory of ageing. These 
models assume the presence of a trade-off between mortality and fertility. The 
quantitative aspects of this approach were first summarised in the disposable 
soma theory of ageing, which linked the age-related increase in mortality with 
the lack of repair capacity of somatic cells (Kirkwood, 1990). More recently, 
Abrams and Ludwig (1995) explored the relationship between mortality and 
age that would hypothetically occur when individuals used the energy and 
resources normally spent on reproduction on the maintenance and repair of the
organism. Their approach can be seen to oppose the standard life-history
theory where an evolutionarily favoured rate of senescence in a stationary
population should maximise the expected life-time reproductive output $V(\mu)$
(Charlesworth, 1994). Interestingly, the notion of $V(\mu)$ is equivalent to
Fisher's (1930) reproductive value for a non-growing population. In several
specific cases, Abrams and Ludwig (1995) predicted a marked deceleration of 
mortality rates. However, their optimality approach does not allow the 
deleterious mutations to operate throughout the entire life span. Nor did the 
model of Abrams and Ludwig (1995) produce an exponential increase in the 
age-specific death rates, except when some ad hoc assumptions were made 
about the relationship between the rates of repair and reproduction. Moreover, 
in the study of Abrams and Ludwig the death rate still tended to infinity with
age, contradicting the findings based on empirical data discussed by Vaupel et 
al. (1998). 

Mueller and Rose (1996) developed a model for the calculation of mortality 
trajectories in a hypothetical population using the assumptions of either 
antagonistic pleiotropy or mutation accumulation. They concluded that the 
standard evolutionary theories explain late-life mortality plateaus. Their 
antagonistic pleiotropy model assumes that each genetic mutation occurs at a 
distinct genetic locus. In the simplest case, each locus is assumed to influence 
survival probabilities at exactly two randomly chosen ages: new mutations are 
assumed to increase survival at one age and to decrease it at another. 
Charlesworth and Partridge (1997) and Fletcher and Curtsinger (1998) 
criticised Mueller and Rose's (1996) assumptions on the grounds that they 
were unrealistic. Wachter (1999) investigated one of the models suggested by 
Mueller and Rose (1996) and found that this model did not produce a mortality 
plateau. 



11.11 I Concluding remarks 

A better understanding of why and how we age requires interdisciplinary 
efforts. These include research in actuarial sciences and demography, 
epidemiology and biostatistics, gerontology and geriatrics, genetics and 
molecular biology. There have been numerous recent attempts to investigate 
possible mechanisms of ageing and survival in laboratory experiments with 
living subjects, or in computer experiments with mathematical models which 
summarise knowledge accumulated in different disciplines. The use of data
collected in epidemiological, biological and population studies of ageing and
survival requires inclusion of physiological and biological characteristics in the
description of the ageing process. The impact of these characteristics on
mortality must be represented in survival models. The models of Strehler and
Mildvan (1960), Sacher and Trucco (1962), as well as the fixed heterogeneity
and debilitation models provide explanations for Gompertz (exponential)
patterns and for 'levelling off' patterns of mortality (Yashin et al., 1994).
However, these models are still of very little practical use to the statistical
analysis of survival, follow-up, or longitudinal data. A certain compromise is
achieved when using survival models with observed covariates and unobserved
random effects (Andersen et al., 1992). But these models are more appropriate
for testing simple statistical hypotheses about the importance of certain factors
for survival than for studying the mechanisms of ageing.

In contrast, the models of changing frailty are of more practical use and may 
prove convenient for applications. For instance, the stochastic process model 
of human mortality and ageing has been used in the analysis of various types 
of data, including survival and longitudinal studies (Yashin and Manton, 
1997). This model combines the opportunity to account for unobserved 
(genetic or acquired) heterogeneity with the opportunity to use complete or 
incomplete observations of some covariates. The model allows for explicit 
description of the physiological homeostatic mechanism: the stochastic
differential equations describing the dynamics of the state variables are flexible
enough to account for self-regulation and non-linearity. The internal 
stochasticity of ageing can also be captured by stochastic terms on the right- 
hand side of stochastic equations for observed and unobserved covariates. The 
analytical description of the model does not become more complex in the 
multifactorial case, i.e. when the dimension of vectors of both observed and 
unobserved factors is high. The quadratic form of the conditional hazard used 
in this model can be justified both biologically and experimentally. Since the 
fixed frailty is a particular case of this model, the Gompertz exponential 
increase in mortality and its post-reproductive levelling off can be easily 
explained. 

The changing frailty models also provide a solid methodological background 
for mortality forecasting. They enable researchers to model the impact of 
unobserved and partly observed stochastically changing covariates on 
individual survival chances. In these models mortality depends on the covariate 
processes included. Thus, mortality forecasting becomes naturally related to
the predictions of social and economic development, environmental pollution,
changes in cultural traditions, et cetera. The models make possible the
transformation of predictions of trajectories of the incorporated factors into
estimates of future mortality rates. Forecasting may be done separately for
each cohort. The changing frailty models can also be extended into modelling
and forecasting of the mortality surface, that is the mortality rate considered as
a function of age and time. However, such an analysis would require more
sophisticated mathematical tools, based on the theory of random dynamic
fields. Other aspects of these models' applications include forecasting of the
health status of the population. In this case the significant factors may be
described by the processes with discrete (e.g. for the health status) and
continuous (e.g. for environmental factors) components (Yashin et al., 1995).
An important aspect of such a prediction is the presence of a selection process 
that influences the distribution and the average trajectories of the health indices 
in the population. 

Evolutionary biology theories of ageing have made an important contribution
to our understanding of the evolution of, and relationship between, basic life
history traits. However, these theories continue to be the weakest in explaining
observed mortality patterns. They are much less certain in their conclusions
than any of the other models discussed above; mortality does increase during
the reproductive interval because evolutionary pressure declines with age, but
its exponential growth does not follow from this theory. The shape of the
post-reproductive mortality curve remains a mystery; so far, evolutionary theories
have failed to explain not only the age pattern of mortality, but also non-zero
post-reproductive survival for many species. Studies on evolution in
co-operation among animals, either within or between species in the biosystems
(Frank, 1998 and Kaufmann, 1993), may provide us with a better
understanding of the role of the elderly in nature. One basic mechanism,
which makes different species mutually dependent, is the food supply in the 
biosystem where different species are related by a network of trophic chains. 
From this point of view, individual ageing is a process of preparing an 'easy 
catch', that is elderly individuals, who are regularly 'served' to the predators' 
'lunch table' by demographic forces. Disruptions in the availability of a 
particular item of food can destroy the stability of the biosystem. Such 
disruptions can be smoothed and the biosystem can be stabilised if individuals, 
excessive from the traditional evolutionary point of view, live beyond their 
reproductive age. Thus, non-zero post-reproductive survival may contribute to 
the stability of the biosystems. Note that this explanation does not apply to
humans, nor to several other species. In this case, other important mechanisms
may underlie post-reproductive mortality. New models are needed to better
understand the regularities of the evolution of life history traits among species
in the biosystems.



Abstract

In this chapter we summarise the contents of Chapters 1 to 11 and draw
conclusions regarding the methods to be used in modelling and forecasting 
mortality in developed countries in the future. Integrating the experience of 
statistics, demography and epidemiology will contribute to the success of this 
undertaking. How these disciplines can be integrated is discussed following a 
sequence of activities inherent in any forecasting process: the modelling phase, 
the formulation of assumptions and forecasting, and data needs in the future. 
Following an introduction (Section 12.1), Section 12.2 reviews the models of 
mortality presented in this book. In Section 12.3, the formulation of 
assumptions in demography, statistical practice and epidemiology is discussed. 
The data needs are summarised in Section 12.4. Section 12.5 closes this book 
by giving some suggestions for future directions in modelling mortality. 



12.1 I The Need for Information on Future Levels of Mortality 

Accurate mortality predictions are urgently needed for population and health 
forecasting of the elderly population. Forecasts of the age structure and health
status of the elderly are also required for policy and cost-effectiveness
calculations. Note that in all funding programmes costs are determined not
only by the projected size of the elderly population but also by the uncertainty
of the forecasts of mortality and the health status of the elderly. A crucial
finding of recent forecasts based on health data time series is that there is more 
uncertainty than previously thought with respect to the rate of growth of the 
elderly and oldest-old populations, and consequently with respect to the 
resources needed to sustain them. The elderly and the oldest-old populations of 
the future might be considerably larger than currently expected. Manton et al. 
(1993) have shown, for instance for the (US) 85+ population, that the official 
population projections (including their uncertainty limits) made by the Social 
Security Administration and the Census Bureau are all much lower than the 
alternative variants of the US population projected by Ahlburg and Vaupel 
(1990), Guralnik et al. (1988) and Manton and Stallard (1992). 

Due to this uncertainty, interest in methods of mortality forecasting has grown 
considerably. This book was also prompted by the growing interest and its 
major objective is to suggest improvements in forecasting methodology by 
integrating the experiences of the disciplines of demography, statistics and 
epidemiology. In the discussion below, we address formal statistical models of 
mortality, popular in demography, and more practically oriented population 
and health forecasting. The three approaches distinguished have different 
needs, and focus on different aspects of future mortality estimates. We shall 
try to show the specificity of, and possible improvements on each of the three 
approaches. 



12.2 I Models of Mortality: Contemporary Practice 

12.2.1. Demographic Models of Overall Mortality 

In the formal statistical modelling of mortality, issues such as model
formulation, estimation, and evaluation remain central. A recent discussion on
statistical modelling of mortality was devoted primarily to the complexity of
demographic forecasting models (International Journal of Forecasting 8(3)
1992; Mathematical Population Studies 5(3) 1995). In particular, simple
models were suggested to outperform complex ones in terms of (especially
short-term) accuracy of prediction. For some models, the quality of prediction
was shown to be higher when an aggregate was predicted instead of the
components of the aggregate (Alho, 1991). Complex models in turn were said
to be better able to show the general direction in which systems move and to
explain the movements. Model performance was also seen as an empirical
issue, depending on the particular historical period observed and the degree of
demographic variability exhibited during this period (Rogers, 1995).

In many recent demographic and epidemiological studies, explanations of 
mortality movements were of particular interest. Not surprisingly, complex 
models have become popular and were often used in the recent past. This was 
stressed by Tabeau in Chapter 1 and by Van den Berg Jeths et al. in Chapter 2 
of this volume. New examples of complex models were shown in the 
following chapters by Heathcote and Higgins (Chapters 3 and 4), Tabeau et al. 
(Chapter 7) and Van Genugten et al. (Chapter 8). 

Heathcote and Higgins in Chapters 3 and 4 develop a regression approach to 
the modelling and estimation of mortality. A measure of mortality as a 
function of year and age defines what is called a mortality surface. Several 
different mortality surfaces are of interest but the one selected for investigation 
here was the logistic transformation of the year- and age-specific probability of 
death. This is modelled as a bivariate regression surface. Under standard 
assumptions about cohorts, the error terms in the regression are approximately 
independently and normally distributed. Thus, an appropriate method of fitting 
the regression is by iterated weighted least squares. In the case of the Dutch 
population aged 40 years and over, a quadratic in year and age, together with 
several special variables, provides an adequate fit to the mortality surface for 
the period 1890-1990. 
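
The idea of a regression surface for the logit of the death probability can be
sketched in a few lines. This is not the model fitted by Heathcote and Higgins:
the coefficients below are invented for illustration, the centring values
(1940 and 65) are arbitrary, the 'special variables' are omitted, and no
estimation by iterated weighted least squares is attempted. The sketch only
shows how a quadratic in year and age on the logit scale maps back to a
probability of death.

```python
import math


def logit(q):
    """Logistic transformation of a probability q in (0, 1)."""
    return math.log(q / (1.0 - q))


def inverse_logit(z):
    """Map a value on the logit scale back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def quadratic_surface(year, age, coef):
    """Quadratic in (centred) year and age; coef = (c0, ct, cx, ctt, cxx, ctx)."""
    t = year - 1940.0   # hypothetical centring of calendar year
    x = age - 65.0      # hypothetical centring of age
    c0, ct, cx, ctt, cxx, ctx = coef
    return c0 + ct * t + cx * x + ctt * t * t + cxx * x * x + ctx * t * x


def death_probability(year, age, coef):
    """Probability of death implied by the surface at a given year and age."""
    return inverse_logit(quadratic_surface(year, age, coef))


# Invented coefficients, for illustration only.
coef = (-3.5, -0.005, 0.09, 0.0, 0.0005, 0.0)

for age in (40, 60, 80):
    print(age, [round(death_probability(year, age, coef), 4)
                for year in (1890, 1940, 1990)])
```

Working on the logit scale keeps every fitted or extrapolated value inside the
interval (0, 1) once it is transformed back to a probability.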

The second of the two chapters argues that extrapolation of the fitted 
regression leads to forecasts that are not plausible. Coefficients in the fitted 
regression are altered in an apparently reasonable way to obtain what is called 
a predictive model, which is then used to produce forecasts by extrapolation. 
The thrust of the two chapters is that whilst regression methods provide
satisfactory descriptions of the past, they are only a point of departure in the
forecasting enterprise where other considerations may be of crucial
importance.

Heathcote and Higgins situate their approach in a wider context of recent 
attempts at modelling and forecasting mortality. Models proposed by McNown 
and Rogers (1992), Bell and Monsell (1991) and Lee and Carter (1992) are 
taken as examples. Generally, these authors proceed by auto-regressive or 
ARIMA modelling of the cross-sectional age-specific mortality vectors and
then extrapolate to obtain projected mortality. The emphasis of Heathcote and
Higgins' chapter is different in the sense that models for stochastic processes
of life histories along diagonals of the Lexis diagram are taken as a point of
departure. This means that birth cohorts play a prominent role and the question 
of heterogeneity can be addressed. The regression modelling as applied here is 
therefore preferred to the above time series methods. 
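
For comparison, the time series route mentioned above can be illustrated with
its simplest member, a random walk with drift (an ARIMA(0,1,0) model with a
constant) applied to a single mortality index. This is a minimal sketch; the
function name and the index values are hypothetical and do not reproduce any of
the cited models.

```python
def random_walk_with_drift(series, horizon):
    """Point forecasts from a random walk with drift.

    The drift is estimated as the mean of the first differences, which
    equals (last - first) / (number of observations - 1).
    """
    drift = (series[-1] - series[0]) / (len(series) - 1)
    return [series[-1] + drift * h for h in range(1, horizon + 1)]

# Hypothetical mortality index for four successive years.
index = [0.0, -1.0, -2.5, -3.0]
forecast = random_walk_with_drift(index, horizon=3)
```

The forecast simply continues the average historical change, which is why
such extrapolations depend heavily on the observation period chosen.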



12.2.2. Demographic Models of Old-Age Mortality 

Mortality at old age is a crucial component of any forecasting model for
overall mortality. At the same time, more problems are inherent in the 
measurement and modelling of mortality at old ages than at lower ages (for a 
discussion of measurement problems see Section 12.4 in this chapter). In
effect, more uncertainty must be expected in the forecasts of old-age mortality. 
A major model for old age mortality is the Gompertz function. However, 
using the annual age-specific death rates, one can show that the Gompertz 
model fits the data imperfectly for ages above about 80 years. Other logistic-
like functions perform better. Examples of this type of function are, for
instance, the modified Gompertz as proposed by Heligman and Pollard (1980) 
and a quadratic of age for the logistic transformation of death probability (see 
Heathcote and Higgins in Chapter 3). Other recently proposed modelling 
approaches suitable for old-age mortality include the models of Coale and 
Kisker (1990) and Himes, Preston and Condran (1994). Little is known about 
the differences in these models' performance. In Chapter 6 of this volume, 
Boleslawski and Tabeau elaborate on the issue of modelling mortality at old 
age (80+). The main goal of this chapter is to identify methodologies best
suited to modelling and forecasting old-age mortality in developed countries. 
The question is approached by testing several candidate mathematical models
with alternative estimation methods on high-quality old-age mortality data for
four European low-mortality countries. Various pools of mortality and
population data are used for model estimation (an overall pool, i.e. all
countries and all years together; country-specific pools; and country- and
decade-specific analyses). Two different
criteria used in the evaluation of the models are the sum of squared residuals 
(SSR) defined on deaths and relative differences between observed and model- 
based life expectancy. 
Two issues are investigated in more depth. Firstly, a number of models known
to be appropriate for old-age mortality are compared on the basis of complete
well-defined age patterns of mortality (ages 60 to 104). Secondly, the problem
of generating unavailable, or available but unreliable, values of mortality
rates for ages after 84 is addressed. Models used in this part of the analysis are
estimated using truncated (i.e. incomplete) statistical information about the age
pattern (ages 60 to 84 years). For ages beyond 84 years, the model-implied 
values of age-specific death rates are compared with the actual empirical rates 
(ex-post errors are analysed). Interestingly, models recommended as 
preferable in each of the two groups are not necessarily the same. 
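
The second exercise (fit on ages 60 to 84, then compare the extrapolation with
observed rates beyond 84) can be sketched as follows. The example uses a
Gompertz curve fitted by ordinary least squares on log rates, which is only one
of many possible model and estimation choices, and synthetic "observed" rates
of a logistic type; the parameter values and function names are assumptions for
illustration.

```python
import math

def fit_gompertz(ages, rates):
    """OLS fit of log m(x) = log(a) + b*x; returns (a, b)."""
    n = len(ages)
    ys = [math.log(m) for m in rates]
    mean_x = sum(ages) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b

def gompertz(a, b, x):
    return a * math.exp(b * x)

def logistic_rate(x, c=2e-5, b=0.1):
    """Synthetic 'observed' rates that level off at high ages (assumed form)."""
    g = c * math.exp(b * x)
    return g / (1.0 + g)

# Estimate on the truncated age range 60 to 84 only.
fit_ages = list(range(60, 85))
a_hat, b_hat = fit_gompertz(fit_ages, [logistic_rate(x) for x in fit_ages])

# Ex-post relative errors of the extrapolation for ages 85 to 104.
expost = {x: gompertz(a_hat, b_hat, x) / logistic_rate(x) - 1.0
          for x in range(85, 105)}
```

In this constructed case the Gompertz extrapolation overstates mortality by a
growing margin as age increases, which is the kind of ex-post error the
comparison is designed to reveal.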

When high-quality complete data are available, a large gain is usually obtained 
in the fit of three- (compared with two-) parameter models. A smaller gain is 
achieved in four- (compared with three-) parameter models. It seems that 
three-parameter models (and not necessarily four-parameter models) are 
satisfactory for a proper description of mortality at old age. Models for
estimating missing or deficient indicators of mortality of the oldest-old need to 
be selected on the basis of extrapolation over age using models estimated for 
ages from 60 to 84 (i.e. in most cases these are the available reliable data). 
Surprisingly enough, an optimal fit for ages 60-84 does not always imply a 
minimum deviation between the empirical and model-implied life expectancy. 
This is an important finding, which indicates that the choice of model must be
made on theoretical grounds.



12.2.3. Process-Based Demographic Models of Mortality

Forecasting mortality may be improved by using process-based models, as for 
instance those applied in event-history analysis. In Chapter 5 Willekens 
discusses the Gompertz model as a model of duration data. Several features of 
the Gompertz model are listed using the framework of three groups of models: 
Richards' family of growth models, mixture models, and truncated extreme
value distribution models. These features may be important in the ongoing 
attempt to improve mortality forecasts. Of special interest is the model's 
relationship to the family of extreme value distributions, a family that has seen 
some applications to demographic problems but should perhaps be more 
widely known. 
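
For reference, the standard textbook form of the Gompertz model as a model of
duration can be written as below. The symbols a > 0 (initial level of the
hazard) and b > 0 (rate of increase with age) are chosen here for illustration
and are not taken from Chapter 5.

```latex
\mu(x) = a\,e^{bx}, \qquad
S(x) = \exp\!\left(-\int_0^x \mu(u)\,du\right)
     = \exp\!\left(-\frac{a}{b}\left(e^{bx}-1\right)\right), \qquad x \ge 0 .
```

The survivor function S(x) has the form of a Gumbel (minimum) extreme value
distribution truncated at x = 0, which is the link to the extreme value family
referred to above.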
For both historical and practical reasons, the Gompertz model of adult
mortality is perhaps the most important and widely used mathematical form in 
the field of mortality studies. Whilst some empirical studies suggest that the 
linear increase in the logarithm of the mortality rate may have to be changed to 
include a quadratic at older ages, the theoretical underpinning of the model is 
of prime interest to the study of survival distributions. In particular, it forms 
the basis for a process (as opposed to trend)-based approach to forecasting. 
The Gompertz model and its theoretical extensions belong to models that can 
improve the biological realism of mortality forecasts. Models needed for the 
future must be based on explicit descriptions of individual ageing and 
mortality. Models of heterogeneity, stochasticity and homeostatic forces in 
ageing discussed by Yashin in Chapter 11 are of the greatest relevance in this
respect. Many of these models need further methodological development 
before they can be applied, one exception being the models of changing frailty 
that are available for practical use in their present form (e.g. Yashin and 
Manton, 1997). Models proposed within the theory of the evolutionary biology 
of ageing, on the other hand, seem to be far less significant for improving 
modelling and forecasting mortality. As suggested in Chapter 11, these models
continue to be much less conclusive than any other theoretical model 
explaining observed mortality age patterns. 



12.2.4. Demographic Models of Mortality by Components 

Surfaces of cause-specific mortality and of overall mortality by period or 
cohort are modelled by Tabeau et al. in Chapter 7. In this chapter three 
alternative forecasting approaches are discussed: overall mortality observed by 
period, by cohort, or by cause of death. The major objective of the study is to 
give indications about the usefulness of the three alternative methodologies 
using mortality data for four European countries in the years 1950-1994 and 
cohorts born in the 19th and 20th centuries.

Also in this study, parameterisation was used as a modelling and forecasting 
tool. The approach applied here is a modification of the standard forecasting 
procedure based on time series modelling (e.g. McNown and Rogers, 1992;
Knudsen et al., 1993; Thompson et al., 1989). In the analyses in Chapter
7, the parameters are time-dependent and the models dynamic (Tabeau and 
Tabeau, 1995). Dynamic parameterisation functions are estimated using arrays 
of age- and time-specific death probabilities/rates as inputs, allowing the fitting
of a number of annual model age schedules to empirical data in one step. 

Chapter 7 stresses that several benefits can be gained from using the cause-of- 
death approach. First of all, formulating assumptions for mortality by cause is 
relatively simple. This is because the trends and patterns of mortality by cause, 
although volatile, are much clearer than those of overall mortality. The 
selection of causes of death must, however, be different for different 
population segments and must be done carefully. It is important to distinguish 
the main groups of causes, and within each group, the leading and remaining 
causes. The category 'remaining causes' should not be predicted as a separate 
cause but, for instance, as a difference between all-cause mortality and 
mortality from the causes selected for the analysis. Thus, the cause-of-death 
approach should ideally be used in conjunction with the overall mortality
period (or cohort) approach. Note that the cause-of-death approach is not the 
best approach for forecasting mortality of the oldest-old because of the 
multiple causes of death at these ages. Another modelling method, for instance 
parameterisation of overall mortality, appears to be more suitable. 
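
The treatment of the category 'remaining causes' as a residual can be
expressed in a few lines. This is a minimal sketch: the cause names, the rates
and the function name are invented for illustration.

```python
def remaining_causes(all_cause, by_cause):
    """Residual death rate: all-cause forecast minus the explicitly forecast causes."""
    rest = all_cause - sum(by_cause.values())
    if rest < 0:
        # The cause-specific forecasts are inconsistent with the all-cause forecast.
        raise ValueError("selected causes exceed the all-cause forecast")
    return rest

# Hypothetical forecast death rates for one age group and one year.
selected = {"circulatory": 0.0042, "neoplasms": 0.0031, "respiratory": 0.0009}
rest = remaining_causes(0.0105, selected)
```

Computing the residual this way keeps the cause-specific forecasts consistent
with the all-cause forecast and flags cases where they are not.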

The importance of the cause-of-death approach is related to the fact that causes 
of death help us to better understand future levels and trends in mortality. 
However, the cause-of-death approach should not be seen as a technique to 
improve the statistical quality of prediction, which is usually better in the case 
of forecasts of an aggregate such as life expectancy or the age-adjusted total 
mortality rate. It is important to note that estimates of the costs of disability 
related to chronic diseases might improve if mortality by cause of death were 
predicted more accurately. Estimates of disability costs must be based on 
(among other things) future levels of cause-specific mortality. Both 
demographic and epidemiological forecasts of mortality by cause are of 
potential use in estimations of this kind. Having said that, epidemiological and 
demographic forecasts of cause-specific mortality usually have different 
outcomes. Discrepancies observed between these two types of forecasts of 
cause-specific mortality need to be better understood. 



12.2.5. Epidemiological Models of Mortality and Morbidity 

Epidemiological models aimed at projecting mortality as one of the indicators 
of the health status of a population usually take a wider conceptual perspective 
by including determinants of health (see Chapter 2 for an overview of the
several classes of determinants). The Chronic Diseases Model, described by 
Van Genugten et al. in Chapter 8, is an example of this modelling approach. 
The Chronic Diseases Model belongs to a group of similar epidemiological 
models developed for the Netherlands: the Technology Assessment Model 
(TAM; e.g. Bonneux et al., 1994), Erasmus University Rotterdam, and the
model PREVENT (Gunning-Schepers, 1988), University of Amsterdam. All 
these models are characterised by strict data requirements and a complex 
structure of mutually dependent equations and are of the dynamic multi-state 
life table type. These models are useful because of the various types of 
simulations which might be completed using assumptions about the model 
parameters. 

In the Chronic Diseases Model, data on risk factor prevalence and relative 
risks for associated diseases (ratios of disease incidence in exposed to non- 
exposed) are combined with cause-specific morbidity and mortality data from 
vital statistics. Transition rates for the risk factors distinguished are derived 
from time series of risk factor prevalence in the Netherlands. The population 
structure changes every year through birth, migration, mortality and 
transitions between risk factor states. Disease modules calculate incidence, 
prevalence, recovery and mortality for diseases associated with the risk factors 
involved. As a result, outcome variables can be shown not only for (cause- 
specific) mortality but also for different states of disease processes, including 
recoveries. Several risk factors and disease categories can be analysed 
simultaneously. The model output can be used to calculate several public 
health indicators such as life expectancy, healthy life expectancy, disability 
adjusted life expectancy et cetera. Effects of competing death risks can be 
shown. One of the aspects still to be improved is the model validation. This is
hampered by the fact that availability of reliable data on trends in incidence and
prevalence is limited, even in developed countries. 
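
How risk factor prevalence and relative risks combine can be shown with a
static, single-risk-factor calculation. This is a simplified sketch under
stated assumptions, not the Chronic Diseases Model itself, which is dynamic
and multi-state; the function names and all numbers are hypothetical.

```python
def split_incidence(overall, prevalence, rr):
    """Split overall incidence into incidence among non-exposed and exposed.

    Uses overall = prevalence * exposed + (1 - prevalence) * unexposed,
    with exposed = rr * unexposed.
    """
    unexposed = overall / (1.0 + prevalence * (rr - 1.0))
    return unexposed, rr * unexposed

def scenario_incidence(unexposed, prevalence, rr):
    """Overall incidence under an alternative prevalence of the risk factor."""
    return unexposed * (1.0 + prevalence * (rr - 1.0))

# Hypothetical inputs: 30% exposed, relative risk 10, overall incidence 2 per 1000.
i0, i1 = split_incidence(overall=0.002, prevalence=0.30, rr=10.0)
# Scenario: prevalence of the risk factor falls to 20%.
i_low = scenario_incidence(i0, prevalence=0.20, rr=10.0)
avoided = 1.0 - i_low / 0.002      # fraction of incidence avoided
```

A dynamic model applies this kind of relation year by year while the exposed
and non-exposed populations themselves change through transitions.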

The Chronic Diseases Model is well-suited to answering integrated scientific 
and policy questions. For instance, it enables projection of trends in the Dutch
population structure, based on alternative levels of risk-factor exposure. Using 
this model, the effects of potential prevention and intervention measures can be 
analysed in terms of the burden of disease that can be avoided. In this way 
epidemiological models can be a tool in defining health care policies for the 
future. 
One of the applications of epidemiological models is the validation of mortality
projections based on demographic models. For most causes of death, 
projecting mortality separately from other health indicators and determinants 
results in inconsistencies with prediction based on complex health-oriented 
models. An example of this application is shown by Van Genugten et al. in 
Chapter 8, where several alternative predictions of mortality from lung cancer 
and coronary heart disease in the Netherlands were linked to trends in smoking
and compared with the outcomes of the demographic model. The 
epidemiological and demographic methods of modelling and forecasting 
mortality presented in Chapter 8 were conceptually very different. 
Inconsistencies in the numerical mortality outcomes of these two approaches 
were therefore unavoidable, raising questions such as: would a joint 
demographic and epidemiological approach give us a better insight into future 
trends in mortality and health? An epidemiological validation of demographic 
forecasts of cause-specific mortality proved to be invaluable here. However, 
joint efforts seem to be less useful for forecasting the population age structure
because epidemiological models are available for a limited number of diseases. 
In this case, the demographic procedures are well-established and trustworthy. 



12.2.6. Models of Mortality in Population Forecasting 

Chapter 9 by De Beer and Van Hoorn gives an example of mortality forecasts 
made for use in national population forecasting. The example is given for the 
Netherlands. The method of Statistics Netherlands can be characterised as a 
combination of extrapolation of death risks and interpolation of life expectancy 
values between the last observed year t and a target year t+n in the future.
Note that the level of life expectancy in the target year is selected mainly on 
the basis of qualitative assumptions, whilst also using quantitative methods, 
such as time series analysis of age-specific mortality rates/probabilities/ 
quotients. Once the target level of life expectancy is known, interpolation is 
applied for life expectancy between the years t and t+n. Finally, age-specific
mortality patterns are found which satisfy the levels of life expectancy as 
interpolated for the entire forecast horizon. 
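
The target interpolation procedure can be sketched as follows. The sketch
assumes linear interpolation of life expectancy and matches each interpolated
value by scaling all baseline death rates with a single factor found by
bisection; both choices, the synthetic baseline schedule and the function names
are assumptions for illustration and are not the Statistics Netherlands
procedure, which relies on extrapolated age-specific risks.

```python
import math

def life_expectancy(rates):
    """e0 from single-year death rates, assuming a constant hazard within each age."""
    alive, e0 = 1.0, 0.0
    for m in rates[:-1]:
        e0 += alive * (1.0 - math.exp(-m)) / m    # person-years lived in the interval
        alive *= math.exp(-m)
    return e0 + alive / rates[-1]                 # open-ended last age group

def interpolate_targets(e_start, e_target, n):
    """Linear path of life expectancy from year t (k = 0) to year t+n (k = n)."""
    return [e_start + (e_target - e_start) * k / n for k in range(n + 1)]

def scale_to_target(rates, target, lo=0.05, hi=2.0, tol=1e-9):
    """Find the factor k such that life_expectancy(k * rates) equals the target."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if life_expectancy([mid * m for m in rates]) > target:
            lo = mid      # mortality too low, so the factor must be larger
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Synthetic baseline schedule for ages 0 to 110 (invented parameters).
base = [0.0005 + 0.00003 * math.exp(0.1 * x) for x in range(111)]
e_start = life_expectancy(base)
targets = interpolate_targets(e_start, e_start + 4.0, n=5)
factors = [scale_to_target(base, tgt) for tgt in targets]
```

A proportional scaling changes all ages alike; in practice the age pattern of
the decline is itself an assumption that has to be justified.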

In the latest official projection, the target year is 2050. Life expectancy at birth 
by 2050 is 83 for women and 80 years for men (medium variant). A large 
increase is assumed in life expectancy for men, mainly due to declines in 
mortality between 30 and 60 years, and only minor mortality reductions for 
women. Mortality of the oldest-old is assumed to decline in the long term, as a
result of the assumption that the effects of selection will become weaker than 
the effects of medical progress in the future. 

In the prediction method discussed in Chapter 9, the selection of the target 
level of life expectancy is crucial. For the Netherlands, the ultimate target life
expectancies are fairly acceptable and remain in line with the trends and 
patterns observed in Dutch mortality in the past, recent views on explanations 
of changes in mortality in the Netherlands and other developed countries, and 
expectations with respect to longevity. A proper selection of targets is, 
however, an art in itself, which is why we discuss it in more detail in Section 
12.3 of this chapter which is devoted to the choice of assumptions for 
forecasting. 

In Chapter 10, Cruijsen and Eding summarise the latest mortality forecasts in
the countries of the European Union (EU). In fact, the material discussed in
this chapter is derived from systematic evaluations of the population
projections in the EU, useful to both the makers and the users of population
projections. In 1996, for the purpose of monitoring, the same authors
reviewed the quality of national long-term population scenarios 1991/1993
(the 1991 scenarios covered the former European Community (EC), and those of
1993 the former European Free Trade Association (EFTA);
EC+EFTA=EEA+). More specifically, they showed how close these
projections were to the observed figures for EEA+ countries for 1990-1994.
For mortality in almost all countries of the EEA+, the prevailing trend in life
expectancy was a continuation of the increase observed in the 1980s. In
general, the number of deaths according to the high scenario was close to the 
observed trend, except for the Netherlands where the low scenario appeared to 
be better than the high one. Ultimately, more than 50 per cent of the countries 
had values between the high and the low scenarios. Note that in the short run 
mortality projections were much more reliable than fertility projections. 

The above conclusion is a good starting point for investigating the most recent 
national forecasts of mortality in Europe. When doing so, one might also be 
interested in the question as to how the latest forecasting methods have evolved 
compared with those recommended. 

In Chapter 10, Cruijsen and Eding report that apart from Portugal and Greece,
all member states of the European Union (EU) in the mid-1990s revised their
official national population forecasts. All countries foresee a further
improvement in life expectancy at birth, but at a lower speed than observed 
during the last two to three decades. Denmark and Finland are most 
'pessimistic'. Both countries assume that the observed long-standing gradual 
increase in life expectancy will come to a halt in the medium term. Austria, 
Belgium and France, on the contrary, expect that the gain in life expectancy 
will systematically increase. The other EU countries assume that the increase 
will slow down. As a result, the relative national differences in life expectancy 
are expected to grow in the next decades. 

Compared with the national forecasts made around 1985, the current
prospects for life expectancy in 2000 are substantially higher. In other
words, most national forecasters became more 'optimistic' after they were 
confronted with significant projection errors in the period 1985-1995. 

Most countries apply relatively simple forecasting methods. Most common is 
the use of constant or slightly decreasing age-specific reduction factors, 
heavily based on recent trends. In the assumption-making process, some 
countries begin by guessing the level of life expectancy at birth for a target 
year in the distant future, followed by interpolation. Only Italy applied a rather 
detailed mortality component (i.e. regional) projection together with an APC 
model for all-cause mortality at the national level. The aggregate approach was 
superior to the regional-level projections. Trends in mortality by cause of 
death were used to validate the outcome of the APC model. 
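
The simplest of these methods, constant age-specific reduction factors derived
from a recent trend, can be sketched as below. The age groups, the rates and
the function names are hypothetical.

```python
def reduction_from_trend(m_old, m_new, span):
    """Average annual reduction factor implied by two rates observed `span` years apart."""
    return (m_new / m_old) ** (1.0 / span)

def project_rates(base, reduction, years):
    """Apply constant annual reduction factors to base-year rates, by age group."""
    return {age: m * reduction[age] ** years for age, m in base.items()}

# Hypothetical rates ten years apart for two age groups.
old = {"60-64": 0.0200, "80-84": 0.1000}
new = {"60-64": 0.0160, "80-84": 0.0900}
factors = {age: reduction_from_trend(old[age], new[age], 10) for age in old}
projected = project_rates(new, factors, years=10)
```

Because the factors are taken directly from the recent past, the projection
repeats the last decade's proportional decline in each age group.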

Almost all countries consult experts. A few countries report international 
comparisons. However, hardly any expert opinions or basic qualitative 
assumptions underlying the model input or the results of time series analyses 
are mentioned in their publications. 

Finally, in spite of the growing uncertainty of future mortality trends and the 
strong impact of population ageing, seven countries still compile mortality 
forecasts without uncertainty variants. The countries that do produce an
'optimistic' or a more 'pessimistic' variant apply fairly different margins.

In conclusion, projection procedures applied by many countries are poor. The 
procedures are based on methods of low scientific standing. Assumptions are 
selected subjectively, frequently not using formal methods. The lack of 
documentation substantiating this choice gives the impression that assumptions
are selected ad hoc. Qualitative assumptions are formulated by countries, but a
similar rationale, i.e. similar qualitative assumptions, may be (and are)
interpreted in different ways, resulting in different quantitative assumptions.
More transparency in assumptions and methods used is urgently needed. 
Another reason for demanding more clarity is that life expectancy in developed 
countries has reached extremely high levels and the role of changing structural 
patterns of mortality will be more decisive in the future than it was in the past. 
More studies are needed about possible developments in mortality, conditional 
on expectations related to the well-known factors influencing mortality (and 
combinations of these factors), such as age, sex, cause of death with a focus 
on avoidable premature mortality, cohort and period effects, socioeconomic 
status and marital status, and in some countries urban-rural differences in
mortality, to name but a few. All these studies tell us more about possible
future trends. Studying components of the mortality process is a first step 
towards creating a clearer picture of the targets or towards validating future 
trends. 



12.3. Formulating Assumptions in Mortality Forecasting

12.3.1. Assumptions in Predicting Mortality by Extrapolation 

Forecasters often use extrapolation in a strict statistical way. An example of 
such an approach can be found in Chapter 3 by Heathcote and Higgins. The 
prediction discussed here is based on the best statistical model of the historical 
cohort mortality in the Netherlands disentangled (by the use of dummy 
variables) from a period 'bias'. Long-term trends in Dutch female mortality 
were, however, more optimistic than the trends observed recently. 
Consequently, for women the long-term trends produced a rather optimistic 
prediction of mortality. An opposite situation took place for men. In effect, for 
both men and women, the long-term extrapolation from a descriptive model 
led to predictions that were not plausible. 

As a remedy to this situation and in order to produce more reliable forecasts, 
the authors formulated a predictive model by altering some parameters of the 
descriptive one (Chapter 4), and by giving a transparent explanation to the 
changes made. The forecasts of Dutch mortality now seem reasonable. 
Formulating a satisfactory model for prediction is difficult. The period of
observation is highly relevant, and many forecasters base predictions on the
recent rather than the distant past. Use of dummies is recommended, especially when
analysing historical data. The most urgent question of extrapolation is, 
however, not related to the period under consideration and not to the use of 
dummies but to how the expectations one has can be incorporated in the 
prediction procedure. 

Instead of using extra forecasts as an additional input, various types of formal 
constraints can be introduced into the forecasting model. Conceptually, the 
constraints are quantitative expressions of the expectations one might have 
with respect to the future. They should be built into the optimisation
procedure. But in order to define the constraints, quantitative information is 
needed with respect to mortality patterns in the future and this is often quite 
difficult. Instead of introducing constraints, one may also try to incorporate 
target mortality patterns into the input data for forecasting. Bad targets would, 
however, worsen the predictive models and produce less reliable forecasts. 
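
One elementary way of building an expectation into the estimation is to force
a fitted trend through an assumed future value. The sketch below does this for
a linear trend in log death rates; the target value, the data and the function
name are hypothetical, and real applications would involve richer models and
constraints.

```python
import math

def constrained_trend(years, log_rates, target_year, target_log_rate):
    """Least-squares line forced through (target_year, target_log_rate).

    Substituting the constraint into y = intercept + slope * t leaves a
    one-parameter regression through the target point.
    """
    num = sum((t - target_year) * (y - target_log_rate)
              for t, y in zip(years, log_rates))
    den = sum((t - target_year) ** 2 for t in years)
    slope = num / den
    intercept = target_log_rate - slope * target_year
    return intercept, slope

# Hypothetical history: log rates falling by 0.015 per year from 1980 to 1995.
years = list(range(1980, 1996))
log_rates = [math.log(0.01) - 0.015 * (t - 1980) for t in years]
# Assumed expectation: a death rate of 0.005 in 2050.
intercept, slope = constrained_trend(years, log_rates, 2050, math.log(0.005))
```

Here the constrained slope is flatter than the historical one, because the
assumed target implies a slower decline than the observed trend. A poorly
chosen target would distort the fit in the same mechanical way.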



12.3.2. Target Selection in Predictions by Interpolation 

Theories explaining ageing and longevity, etiology of diseases — causes of 
death, effects of period and cohort factors etc. could efficiently support the 
formulation of quantitative explanatory models and assumptions for 
forecasting. The older demographic theories, such as the theory of the 
demographic and epidemiological transitions, are mainly devoted to historical 
changes in mortality and focus on socioeconomic and cultural determinants. 
More recent demographic theories include concepts such as selective survival 
(also called unobserved heterogeneity), insult accumulation, kinetics of human 
ageing, law of diminishing returns, and rectangularisation of the survival curve
(a brief review in Tabeau, 1997). The concepts of selective survival and 
unobserved heterogeneity belong to the most important theoretical frameworks 
explaining mechanisms of mortality in the ageing populations of developed 
countries (Shepard and Zeckhauser, 1975; Vaupel et al., 1979; Manton et al.,
1981; Vaupel and Yashin, 1985). The major hypothesis of selective
survival is that populations are made up of individuals with varying mortality 
risks, i.e. heterogeneous individuals with different (genetically determined) 
endowments for longevity. Under the assumption that environmental 
conditions remain constant throughout the life-span, those with short longevity 
endowments, or who are frailer and more susceptible to illness would die first,
to be followed in succession by individuals with progressively more favourable
prospects for longevity. What remains at older ages is the subgroup of the
original cohort that contained the most favourable endowment for longevity.
The composition of the population, consequently, must change with age. At
each age, the population includes a smaller proportion of highly susceptible
individuals. Selection mechanisms eliminate a frail population, leaving a less
frail population to survive. The impact of selection varies, however, in 
successive cohorts. Rising life expectancy (or declining mortality rates) allows 
a higher proportion of the original cohort to survive to each older age but this 
also tends to increase the number of high-risk frail individuals. It spares some 
relatively frail individuals who were the most likely to die under previous 
conditions. As mortality declines, these 'new' or 'marginal' survivors raise the 
average level of frailty and mortality at older ages. 
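
The selection mechanism can be made concrete with the widely used gamma
frailty formulation. The sketch below assumes gamma-distributed frailty with
mean 1 at birth and a Gompertz baseline hazard; these distributional choices,
the parameter values and the function names are assumptions for illustration.

```python
import math

def mean_frailty(x, a, b, var):
    """Mean frailty of survivors at age x (gamma frailty, mean 1, variance var)."""
    cum_hazard = (a / b) * (math.exp(b * x) - 1.0)   # baseline cumulative hazard
    return 1.0 / (1.0 + var * cum_hazard)

def population_hazard(x, a, b, var):
    """Observed hazard: baseline hazard times the mean frailty of survivors."""
    return a * math.exp(b * x) * mean_frailty(x, a, b, var)

A, B, VAR = 5e-5, 0.1, 0.25          # invented parameter values

# Selection: average frailty among survivors falls with age.
z60 = mean_frailty(60, A, B, VAR)
z90 = mean_frailty(90, A, B, VAR)

# Lower baseline mortality (halved level) spares frailer individuals.
z80_before = mean_frailty(80, A, B, VAR)
z80_after = mean_frailty(80, A / 2, B, VAR)
ratio_90 = population_hazard(90, A / 2, B, VAR) / population_hazard(90, A, B, VAR)
```

In this setting, halving the baseline hazard raises the mean frailty of
survivors at age 80, and the observed hazard at age 90 falls by less than
half, which is the 'marginal survivors' effect described above.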

It is typical of the theory of selective survival, as well as of other demographic 
theories, that its mechanisms operate at the level of an individual. A more
specific explanation is most typically left to other disciplines, for instance to
gerontological research that explains individual ageing as a result of a failure 
of function at one or more levels of the physiological integration of the 
organism: molecular, cellular and organismal. Despite the many explanatory 
theories of ageing that have become available, explanatory models for 
mortality are scarce and at the present time their contribution to setting 
quantitative targets is limited. 

Nevertheless, targets are constantly used in predictions of mortality, especially 
those made by interpolation. Some ways of improving the process of target 
selection are suggested by De Beer and Van Hoorn in Chapter 9. Statistics 
Netherlands (SN) use the target interpolation method in the preparation of the 
official forecasts of mortality in the Netherlands. SN use qualitative 
assumptions based on mortality theories, especially on the theory of selective 
survival, and relevant empirical results to define the level of a target life 
expectancy and, using trend extrapolation of age-specific death rates, to find 
the age patterns matching the selected target. The target selection could, 
however, be better justified if a quantitative method explicitly showed 
particular changes in relevant determinants of mortality on which the selection 
was conditional. In such experiments, explanatory models for mortality 
formulated in terms of relevant physiological and behavioural risk factors, by 
cause of death, and estimated on the basis of longitudinal data about 
individuals, would prove invaluable. Examples of this type exist for the United
States (Manton et al., 1991).

With respect to target selection for use in interpolation, the importance of 
disaggregation is worth mentioning. Disaggregation is a method of dealing 
with the heterogeneity of the population. It improves the understanding of 
mortality change since reasons for changes in mortality can be better 
hypothesised from changes in the components of the process such as sex, age 
and cause of death. In fact, analysing the components is a compromise 
between the two extreme approaches, the process-oriented explanation and the 
widely used trend extrapolation approach. 

In addition to disaggregation, symptomatic models of mortality, such as 
parameterisation models, Lee-Carter type, and age-period-cohort (APC), offer 
several possibilities for target selection. In these models, no information about 
explanatory variables is included directly in the model structure but instead the
age, period and cohort parameters are used as indicators of a change in 
mortality implied by a change in a relevant factor. Simulated rates of change 
can be validated with epidemiologists. 

Summing up, the demographic forecasting models based on age, period, 
cohort and cause-of-death effects could be more widely used to make policy
makers and other users of mortality forecasts aware of the constraints on
reaching target levels of life expectancy. An informed quantitative speculation
about the effects of a factor included in a model is required, where arguments 
are made explicit and transparent. 



12.3.3. Assumptions in Health Forecasting: Epidemiological Scenarios 

The importance of theory is particularly clear in the epidemiological models of 
public health. Each equation in these models is specified on the basis of 
epidemiological knowledge of disease etiology and relationships between risk 
factor prevalences, disease incidences, and mortality probabilities. These 
models are therefore particularly suited to simulating mortality outcomes that 
are conditional on changes in various model components. In this context, 
formulating assumptions can be improved by using scenario techniques. 

In Chapter 8, based on the Chronic Diseases Model structure, three scenarios
of smoking levels (high, low and medium) are presented, resulting in
changes in mortality from lung cancer and from coronary heart disease. The
results obtained for lung cancer mortality are especially informative. The 
patterns for men and women are quite different. For men, the (very) low 
smoking scenario was the one that matched the external lung cancer mortality 
forecast produced by a demographic extrapolation approach. For women, an 
extrapolation of the trend in smoking behaviour resulted in lower mortality 
than was forecast by extrapolating the trend in lung cancer mortality. 
Consequently, if trend extrapolation for lung cancer mortality serves 
as the model input, substantial changes in current levels of smoking are 
needed. For women, these changes would be contradictory to the goals of 
prevention programmes. 

Using external (e.g. official and/or alternative) mortality forecasts and external 
(e.g. official and/or alternative) population forecasts as an input for 
epidemiological models may help to formulate respective epidemiological 
scenarios (i.e. sets of epidemiological indicators such as incidence, prevalence 
and case fatality rates) as an aid to both model building and setting target 
values. 
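
Matching scenario outcomes against an external forecast, as described above, can be illustrated with a minimal sketch in Python. All names and numbers below are hypothetical and are not taken from the Chronic Diseases Model or from Chapter 8. The root-mean-square gap is only one possible measure of closeness, chosen here for simplicity.

```python
import math


def rms_gap(simulated, target):
    """Root-mean-square difference between two equally long rate series."""
    if len(simulated) != len(target):
        raise ValueError("series must cover the same forecast years")
    squared = [(s - t) ** 2 for s, t in zip(simulated, target)]
    return math.sqrt(sum(squared) / len(squared))


def closest_scenario(scenarios, target):
    """Return the name of the scenario whose outcome is nearest the target."""
    return min(scenarios, key=lambda name: rms_gap(scenarios[name], target))


# Hypothetical lung cancer death rates (per 100,000) in three forecast years.
scenarios = {
    "high": [80.0, 82.0, 84.0],
    "medium": [80.0, 78.0, 76.0],
    "low": [80.0, 74.0, 68.0],
}
# Hypothetical external forecast obtained by trend extrapolation.
external_forecast = [80.0, 75.0, 69.0]

best = closest_scenario(scenarios, external_forecast)
```

In this invented example the low scenario lies closest to the external forecast. The size of the remaining gap indicates how far the epidemiological assumptions would still have to be adjusted.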



12.4. Data Needs in Mortality Forecasting 

Developed countries have the luxury of having extensive records of mortality 
and population data. This does not necessarily mean that the data needs of 
mortality forecasting have been satisfied. In this section, data collection and 
measurement problems in mortality forecasting are discussed from the 
perspective of this book. Most of the work presented in this volume was based 
on aggregated population-level data, with major problems related to the quality 
of old-age statistics and consistency of time series on mortality by cause. We 
shall therefore focus on these two issues here. Note that mortality forecasts 
would be improved by modelling individual-level longitudinal data describing 
physiological processes of ageing and other important determinants of health, 
and that the availability of these data will determine how far process-based 
mortality forecasts are feasible in the future. Specific problems related to the availability of 
individual-level longitudinal data on explanatory variables for mortality are left 
for discussion elsewhere because the data sources and availability differ greatly 
among countries. For the Netherlands, the interested reader can refer to 
Ruwaard and Kramers (1997 and 1998). 



12.4.1. Data Needs in Forecasting Old-Age Mortality 

The age patterns of old-age mortality, and in particular of mortality of the 
oldest-old, should be analysed with caution. Chapter 6 is devoted to some 
extent to the measurement of oldest-old mortality, the major problem being 
that errors in data play a much more serious role in the case of the elderly than 
in other large groups of the population. Errors are present in both death and 
population statistics. One source of errors in death statistics is the general 
tendency to overstate the age at death of elderly persons, in particular of 
centenarians, by their family members. Apart from age overstatement, heaping 
at round ages such as 100 or 105 is also observed. Unreliable population 
statistics largely result from errors in death statistics that served to estimate the 
population age structure in intercensal periods in the past. The effects of errors 
are felt most in small population groups such as centenarians. To eliminate 
these errors, two steps are usually taken: first, death statistics are improved, 
and then population numbers are reestimated using the new numbers of deaths. 
France is an example of a country where much has already been done to 
complete historical mortality data and to improve the quality of these data. An 
overview of 19th-century data, their sources, operations performed on the data, 
and a critical evaluation of data quality was made by Mesle and Vallin (1989). 
20th-century mortality data, including data for the oldest-old, were discussed 
and validated by Mesle and Vallin (1989), Wilmoth et al. (1989), and Caselli 
et al. (1987). Much remains to be done to improve the population numbers. 

For several developed countries, improved death and population statistics lie at 
the basis of the development by Kannisto and Thatcher of the Oldest-old 
Mortality Data Base, an international project coordinated by Vaupel at the 
Odense University Medical School in Denmark. The data from this collection 
also meet the criteria of reliability and consistency for all the countries 
included. 

Estimating the force of mortality is difficult unless longitudinal data are 
available. It is common practice to use the rate of mortality as an estimate of 
mortality force. In the case of mortality after the age of 80, this does not yield 
satisfactory results. 

The mortality rate represents a number of events (i.e. deaths) divided by the 
exposure time. It is useful to distinguish between the exposure time and an 
approximation of the exposure time. When longitudinal data are available, 
exposure time can be measured precisely. However, in most cases only 
grouped data are available. In that case, an approximation is the product of the 
mid-year population and the length of the observation period (so-called 
person-years). This results in a (piecewise) linear survival function. Other 
approximations are also known, for example the approximation implied by the 
assumption that the instantaneous rate is (piecewise) constant, leading to an 
exponential survival function, or the Weibull distribution assumption. 

An approximation of the exposure time based on the mid-year population is 
particularly improper for the elderly population in which deaths occur much 
more rapidly with time than at younger ages. Old-age mortality is likely to be 
underestimated using the mid-year population as the estimate of exposure time. 
In effect, a levelling off of mortality is sometimes indicated for the highest 
ages, which, however, might be partly related to measurement deficiencies 
rather than to the true phenomenon of levelling off. The suitability of the 
Gompertz model at advanced ages is then questioned but this is an issue that 
will not be settled until good, preferably longitudinal, data become widely 
available. 
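
The difference between the two approximations of exposure time can be made concrete with a small sketch in Python. It assumes a closed group followed over one period without migration, so that under linear survival the mid-period population equals the initial population minus half the deaths. The function names and the numbers are hypothetical.

```python
import math


def rate_midyear(deaths, start_pop, years=1.0):
    """Death rate with exposure taken as mid-period population times length.

    Assumes deaths are spread evenly over the period (linear survival).
    """
    mid_pop = start_pop - deaths / 2.0
    return deaths / (mid_pop * years)


def rate_constant_hazard(deaths, start_pop, years=1.0):
    """Death rate assuming a constant force of mortality over the period.

    This corresponds to an exponential survival function.
    """
    survivors = start_pop - deaths
    return -math.log(survivors / start_pop) / years


# Hypothetical low-mortality group: 10 deaths among 1000 persons.
young_mid = rate_midyear(10, 1000)
young_exp = rate_constant_hazard(10, 1000)

# Hypothetical very old group: 500 deaths among 1000 persons.
old_mid = rate_midyear(500, 1000)          # about 0.667
old_exp = rate_constant_hazard(500, 1000)  # about 0.693
```

At low mortality the two estimates are practically identical, whereas at very high mortality the estimate based on the mid-period population falls below the constant-hazard estimate. This is in line with the underestimation noted above, although which assumption is closer to the truth at the highest ages cannot be decided from grouped data alone.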

Another measurement issue is related to the classification of deaths. A Lexis 
diagram offers three different perspectives for the classification: age-period, 
period-cohort and age-cohort. The period-cohort observations are not suitable 
for analysing old-age mortality, mainly because of the rapid age-related 
increase in mortality of the old. The two remaining classification schemes are 
preferable. Note that each of the three classification options can be used to 
reconstruct mortality experience in two conceptually different perspectives, by 
cohort and by period. Longitudinal (i.e. cohort) studies of mortality help to 
formulate theoretical models of mortality and are therefore of the greatest 
interest, but period patterns of mortality are commonly available. Differences 
between period and cohort life expectancies can best be summarised in the 
concept of 'a mean life duration', which is a measure based partly on cohort 
and partly on period data (Brouard, presentation at the NIDI workshop, 1997). 
For example, in 1994 the oldest members of the cohort born in 1900 (in actual 
fact, in 1899-1900 if the a-p observational plan is used) were 94 years old. 
Mortality of those younger than 94 years can be estimated from observations 
for the years from 1900 up to 1994. However, mortality of cohort members 
older than 94 was still unknown in 1994. The missing observations of 
mortality of those older than 94 can be approximated by using mortality rates 
after age 94 as observed in the calendar year 1994. This mixture of cohort and 
period death rates can be used to calculate a life table resulting in the life 
expectancy which is called the 'mean life duration'. The mortality measures 
underlying the mean life duration can then help to solve problems of 
incomplete cohorts. 
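
The construction of the mean life duration can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure presented at the workshop: the death rates are invented, the force of mortality is taken to be constant within each year of age, the last rate is applied to an open-ended age group, and the choice to use cohort rates up to and including age 94 is our reading of the example above.

```python
import math


def splice_rates(cohort_rates, period_rates, first_period_age):
    """Cohort rates below first_period_age, period rates from that age on."""
    if len(cohort_rates) < first_period_age:
        raise ValueError("cohort rates must cover all ages below the splice")
    return list(cohort_rates[:first_period_age]) + list(
        period_rates[first_period_age:]
    )


def life_expectancy(rates):
    """Life expectancy at age 0 from single-year death rates."""
    if rates[-1] <= 0:
        raise ValueError("the rate of the open-ended group must be positive")
    survivors = 1.0
    person_years = 0.0
    for rate in rates[:-1]:
        if rate > 0:
            # Person-years lived in the year under a constant hazard.
            person_years += survivors * (1.0 - math.exp(-rate)) / rate
        else:
            person_years += survivors
        survivors *= math.exp(-rate)
    # Open-ended last group: expected remaining time is 1 / rate.
    person_years += survivors / rates[-1]
    return person_years


# Hypothetical period rates for ages 0 to 110 (observed in one calendar year).
period = [0.00005 * math.exp(0.1 * age) for age in range(111)]
# Hypothetical cohort rates for ages 0 to 94 (higher, reflecting the past).
cohort = [1.5 * rate for rate in period[:95]]

mixed = splice_rates(cohort, period, first_period_age=95)
mean_life_duration = life_expectancy(mixed)
```

Because the invented cohort rates exceed the period rates at every age below 95, the resulting mean life duration is lower than the period life expectancy computed from the same period rates.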



12.4.2. Data Needs in Forecasting Mortality by Cause of Death 

Cause-of-death data are usually processed and published as age-period figures. 
Occasionally, data on causes of death are available from the national statistical 
offices as age-period-cohort numbers which, together with the population age 
structure, could be used to calculate cohort measures of cause-specific 
mortality. Cohort mortality rates by cause of death are only rarely processed 
and published by statistical offices. One important reason for this is the 
succession of changes in the Revisions of the International Classification of Diseases 
(ICD), as a result of which causes of death in one cohort are reported in the 
coding of different ICD Revisions. For instance, the cohort born in 1900 
experienced all nine revisions of the ICD system and will be included in the 
tenth. This problem can be solved by reclassifying deaths by cause. France is a 
country where reclassified data could serve to reconstruct rough estimates 
(based on five-year age groups) of cohort and cause-specific mortality since 
1925 (Vallin and Mesle, 1990). The Netherlands is another country with 
reclassified data (since 1875) about a relatively small number of mainly 
historically relevant causes of death (Van den Bosch et al., 1994). For recent 
years, i.e. since 1950, (single-year) age-, period-, and cohort-specific data for 
causes of death could be obtained in a few countries with good mortality 
statistics, such as the Netherlands. These data are not easily accessible, firstly 
because they require time-consuming and expensive preparation, and secondly 
because strict data protection regulations have to be observed. In practice, 
therefore, data on mortality by cause of death are available in the age-period 
classification. 

Working with age-period data does not, however, solve the problem of 
changes in the ICD. The World Health Organisation (WHO) offers a database 
that can be consistently used as a source of data for forecasting 
cause-of-death-specific mortality at the national and international levels. The WHO data on 
causes of death are available in a uniform and consistent format. For the sake 
of consistency, an international format of the medical death certificate was 
recommended by the WHO and is used in almost all countries. In addition, 
causes of death are classified on the basis of these certificates in accordance 
with the ICD. However, in practice many inconsistencies exist within each 
country and even more so among countries. The most important sources of 
inconsistencies include decennial revisions of the ICD, differences in national 
interpretations of international regulations, and differences in the qualifications 
and training of physicians and the staff who code the causes reported on death 
certificates (e.g. Rosenberg, 1993 and Mesle, 1994). In order to properly link 
the data from different ICD Revisions, conversion tables must be developed 
and used. The ultimate links between subsequent ICDs should ideally be 
expressed in terms of the 3- and 4-digit basic categories and not necessarily as 
the (aggregated) categories of the WHO short tabulation lists. On the basis of 
conversion tables, aggregation algorithms can be developed to produce 
consistent time series of cause-specific mortality data. However, despite the 
careful procedures noted above, inconsistencies are likely to be found in 
comparative international studies, especially between ICD-7 and ICD-8, and 
more recently between ICD-9 and ICD-10. As a result of these distortions, 
breaks in the time series occur, implying that for some causes of death shorter 
time series must be modelled. 
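
The role of a conversion table in producing a consistent time series can be illustrated with a minimal sketch in Python. The table below is a small illustrative fragment at the 3-digit level and not an official bridge between Revisions. The harmonised category names, the placeholder code and the death counts are hypothetical.

```python
from collections import defaultdict

# Illustrative fragment of a conversion table: (revision, code) -> category.
CONVERSION = {
    ("ICD-9", "162"): "lung_cancer",
    ("ICD-10", "C33"): "lung_cancer",
    ("ICD-10", "C34"): "lung_cancer",
    ("ICD-9", "410"): "acute_mi",
    ("ICD-10", "I21"): "acute_mi",
}


def harmonise(records, conversion=CONVERSION, unmapped="other"):
    """Aggregate deaths by year and harmonised cause category.

    Each record is a tuple (year, revision, code, deaths). Codes missing
    from the conversion table are collected under the `unmapped` label so
    that they remain visible instead of being silently dropped.
    """
    series = defaultdict(int)
    for year, revision, code, deaths in records:
        category = conversion.get((revision, code), unmapped)
        series[(year, category)] += deaths
    return dict(series)


# Hypothetical death counts coded under two different Revisions.
records = [
    (1990, "ICD-9", "162", 100),
    (1990, "ICD-9", "410", 200),
    (2000, "ICD-10", "C33", 5),
    (2000, "ICD-10", "C34", 95),
    (2000, "ICD-10", "I21", 180),
    (2000, "ICD-10", "UNLISTED", 7),  # placeholder for a code not in the table
]

series = harmonise(records)
```

Keeping the unmapped deaths as a separate category makes it easier to judge whether a break in a series reflects a real change in mortality or a gap in the conversion table.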



12.5. Concluding Remarks 

The thrust of this book is that satisfactory forecasting requires the integration 
of perspectives and techniques from the disciplines of statistics, demography 
and epidemiology. Whilst individual forecasters are often dead by the time 
their work is put to the test, forecasting efforts are, and should be, embarked 
on with the utmost care. This sense of gravity is rooted in the realisation that 
failure is almost inevitable and that it is a matter of minimising future damage 
since the future can never be predicted with absolute certainty. In addition, we 
face the practical consequences that may stem from inappropriate long-term 
planning of health and welfare services based on forecasts that may prove to 
be incorrect. There is also a discrepancy between the knowledge required to 
improve health policies and social programmes for the elderly in ageing 
populations and what can be anticipated about future developments in health 
and mortality. The discrepancy raises questions about how to reduce the gap 
between research and current forecasting methods. This is a question that has 
been central to the chapters in this book and its resolution requires the 
interdisciplinary efforts referred to. 

Multistate morbidity and disability models with inputs from statistics, 
demography and epidemiology will, we believe, play an increasingly 
important role in future studies. Whilst there are important theoretical issues to 
be explored, a crucial stumbling block is likely to be the scarcity of 
satisfactory longitudinal data. Our final comment is that progress along the 
multidisciplinary path advocated will be hindered, perhaps seriously, if we fail 
to address this practical issue. 